(57) Abstract: Methods of producing cyclic peptides and splicing intermediates
of peptides in a looped conformation are disclosed. The methods utilize the
trans-splicing ability of split inteins to catalyze cyclization of peptides from a
precursor peptide having a target peptide interposed between two portions of a
split intein. The interaction of the two portions of the split intein creates a
catalytically-active intein and also forces the target peptide into a loop
configuration that stabilizes the ester isomer of the amino acid at the junction
between one of the intein portions and the target peptide. A heteroatom from the
other intein portion then reacts with the ester to form a cyclic ester intermediate.
The active intein catalyzes the formation of an aminosuccinimide that liberates a
cyclized form of the target peptide, which spontaneously rearranges to form the
thermodynamically favored backbone cyclic peptide product. Also disclosed are
nucleic acid molecules, polypeptides, methods for making cyclic peptides, methods
of making libraries, and methods of screening peptides.
CYCLIC PEPTIDES

Cross Reference To Related Applications 
The present application claims the benefit of U.S. provisional patent application
Serial No. 60/112,723 filed December 18, 1998 and U.S. provisional patent application
Serial No. ______ entitled "Production of Cyclic Peptides and Proteins In Vivo" filed
October 7, 1999, both of which are incorporated herein by reference.

Statement As To Federally-Sponsored Research
This invention was made with Government support under grants GM13306 and
GM19891 awarded by the National Institutes of Health. The Government may have
certain rights in the invention.

Field Of The Invention 
The invention relates to the field of biochemistry. More particularly, the invention
relates to cyclic peptides, methods for making cyclic peptides, and methods of
screening cyclic peptides for particular characteristics.

Background Of The Invention 
Small linear peptides are useful for investigating various physiological phenomena
because they exhibit a wide range of biological activities and can be easily synthesized in
almost infinitely variable sequences utilizing conventional techniques in solid phase synthesis
and combinatorial chemistry. These qualities also make small linear peptides especially
useful for identifying and developing new drugs. For example, large libraries of myriad
different small linear peptides can be prepared synthetically and then screened for a
particular characteristic in various biological assays. E.g., Scott, J. K. and G. P. Smith,
Science 249:386, 1990; Devlin, J. J., et al., Science 249:404, 1990; Furka, A. et al., Int. J.
Pept. Protein Res. 37:487, 1991; Lam, K. S., et al., Nature 354:82, 1991. Those
peptides within the library that exhibit the particular characteristic can then be isolated as
candidates for further study. Microsequencing or other chemical analyses can then be used
to characterize selected peptides by, for example, amino acid sequence. Despite these
advantages, only a handful of small linear peptides have been developed into widely-used
pharmaceutical drugs. One reason for this is that small linear peptides are usually cleared
from the body too rapidly to be of therapeutic value.

Ring closure, or cyclization, can reduce the rate at which peptides are degraded in
vivo and therefore dramatically improve their pharmacokinetic properties. The majority of
cyclic peptides of known therapeutic value have been identified after isolation from natural
sources (e.g., calcitonins, oxytocin, and vasopressin). Unfortunately, the pool of
naturally-existing cyclic peptides that can be screened for a particular biological activity is
inherently limited. And, moreover, the onerous steps required to isolate and purify cyclic
peptides from natural sources render such screens costly and impractical. Thus, synthetic
methods for producing large numbers of different peptides of infinitely variable amino acid
sequences would greatly facilitate identifying particular cyclic peptides as candidates for
new drugs.

Various methods for producing cyclic peptides have been described. For example,
chemical reaction protocols, such as those described in U.S. Patent Nos. 4,033,940 and
4,102,877, have been devised to produce circularized peptides. In other techniques,
biological and chemical methods are combined to produce cyclic peptides. These latter
methods involve first expressing linear precursors of cyclic peptides in cells (e.g., bacteria)
and then adding an exogenous agent such as a protease or a nucleophilic reagent to
chemically convert these linear precursors into cyclic peptides. See, e.g., Camarero, J. A.,
and Muir, T. W., J. Am. Chem. Soc. 121:5597 (1999); Wu, H. et al., Proc. Natl. Acad. Sci.
USA 95:9226 (1998).

Once produced, cyclic peptides can be screened for pharmacological activity. For
example, a library containing large numbers of different cyclic peptides can be prepared
and then screened for a particular characteristic, such as the ability to bind a specific target
ligand. The library is mixed with the target ligand, and those members of the library that
bind to the target ligand can be isolated and identified by amino acid sequencing. Similarly,
libraries of cyclic peptides can be added to assays for a specific biological activity. Those
cyclic peptides which modulate the biological activity can then be isolated and identified by
sequencing.

Unfortunately, because the step of identifying the active peptides can be difficult,
these screening assays can prove laborious and time-consuming. For instance, screening
assays usually mandate a reverse-mapping step because the actual amount of cyclic peptide
that binds a target ligand or modulates a biological activity is usually so minute that it cannot
be sequenced directly. To avoid this problem, a map indicating the physical location of the
various cyclic peptides comprising a library can be made. Aliquots of cyclic peptides from
the different locations are then transferred to corresponding locations within the screening
assay, and those areas in the assay that exhibit the screened-for activity (e.g., binding or
modulation of biological activity) are then mapped back to their corresponding location in
the library. The cyclic peptides in that area of the library can then be isolated and
sequenced. Difficulties arising from the need for spatial resolution and the limitations
imposed by sample handling limit the number of candidate peptides that can be screened in
any given period of time.
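
The reverse-mapping step described above can be sketched in code. The following is a minimal illustration only, assuming the library map is kept as a lookup from a (plate, well) location to a pool identifier. The names library_map and reverse_map, the addressing scheme, and the identifiers shown are hypothetical and are not taken from any particular screening system.

```python
from typing import Dict, List, Tuple

Location = Tuple[int, str]  # (plate number, well label); a hypothetical addressing scheme


def reverse_map(library_map: Dict[Location, str], hits: List[Location]) -> List[str]:
    """Return the library pools stored at the locations that scored in the assay.

    Locations absent from the map are skipped, since nothing can be recovered for them.
    """
    return [library_map[loc] for loc in hits if loc in library_map]


# Hypothetical map recording where each pool of cyclic peptides sits in the library.
library_map = {
    (1, "A1"): "pool-001",
    (1, "A2"): "pool-002",
    (2, "B7"): "pool-103",
}

# Assay positions that showed the screened-for activity (illustrative values).
hits = [(2, "B7"), (1, "A1"), (3, "C3")]

pools_to_sequence = reverse_map(library_map, hits)
print(pools_to_sequence)
```

Every candidate must keep a distinct, recorded location for this lookup to work, which is one way to see why spatial resolution limits the number of peptides that can be screened.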

The number of peptides that can be screened in an assay can be dramatically
increased by using cells that express the peptides. For example, bacteria engineered to
express a library of linear peptides can be added to a screening assay, and those bacteria
that express the screened-for characteristic can be picked directly. The
picked bacteria can then be reproduced to large numbers such that the selected linear
peptides can be made in large quantities to facilitate their identification (e.g., by sequencing)
and production. Making and screening small linear peptide libraries in vivo has, however,
proven to be troublesome because small linear peptides are rapidly degraded by normal
cellular metabolic processes. Cyclization of the peptides can circumvent this problem by
rendering the peptides stable within a cell.

Despite this, heretofore, intracellular production of large libraries of cyclic peptides
has not been feasible because general, easy-to-perform methods for cyclizing peptides in
vivo have not been available. For example, a known method of producing cyclic peptides in
vivo utilizes non-ribosomal peptide synthetase (NRPS) complexes (Cane et al., Science
282:63, 1998). Such NRPS complexes are, however, neither facile to work with nor
generally useful for the production of more than a single cyclic peptide at a time. Moreover,
unlike ribosomal peptide synthesis where the linear sequence of monomers (amino acids) is
dictated by the linear sequence of bases in the nucleic acid molecule encoding it, the linear
sequence of monomers in a peptide made by the NRPS method is dictated by the subunit
organization of the NRPS complex. Changing the sequence of a cyclic peptide made by
NRPS entails cloning the subunit(s) which incorporate the desired monomers and
introducing the subunit(s) into host cells already harboring all of the other necessary
subunits. Making a library using this technique would require introducing combinations
(both in composition and order) of NRPS subunits to host cells and devising a method for
ensuring that the subunits assemble into the correct supramolecular structures.

Summary Of The Invention
A general method for the in vivo production and screening of cyclic peptide
libraries has been discovered. In this method, a nucleic acid molecule is constructed such
that a nucleotide sequence encoding the peptide to be cyclized is flanked on one end with a
nucleotide sequence encoding the carboxy-terminal portion of a split (or trans) intein
(C-intein or Ic) and on its other end with a nucleotide sequence encoding the amino-terminal
portion of a split intein (N-intein or In). Expression of the construct within a host system
such as a bacterium or eukaryotic cell results in the production of a fusion protein. The
two split intein components (i.e., Ic and In) of the fusion protein then assemble to form an
active enzyme that splices the amino and carboxy termini together to generate a backbone
cyclic peptide.

[Reaction scheme: structures A through E]

Formation of the active intein from the amino and carboxy-terminal fragments stabilizes the
ester isomer of an amino acid at the junction between the N-intein and the peptide to be
cyclized (in B above, X = S or O). When R = XH, the heteroatom from the C-intein is poised
to attack the ester and generate a cyclic ester intermediate (C). Intein-catalyzed
aminosuccinimide formation (D) liberates the cyclic peptide (in the lactone form), which
spontaneously rearranges to form the thermodynamically favored backbone (lactam form)
cyclic peptide product (E). This method can be adapted to facilitate the selection or
screening of cyclic peptides with predetermined characteristics.
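
The arrangement of the precursor described above, with the target peptide interposed between the C-intein and the N-intein, can be sketched as a simple data structure. This is an illustrative sketch only: the class name Precursor, its fields, and the placeholder strings are hypothetical, and the strings are not real intein or peptide sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    """Fusion protein laid out as C-intein (Ic), target peptide, N-intein (In)."""

    c_intein: str  # carboxy-terminal portion of the split intein (Ic)
    target: str    # peptide to be cyclized
    n_intein: str  # amino-terminal portion of the split intein (In)

    def sequence(self) -> str:
        # The target is interposed between the two intein portions.
        return self.c_intein + self.target + self.n_intein

    def target_span(self) -> range:
        # Positions of the target peptide within the full precursor.
        start = len(self.c_intein)
        return range(start, start + len(self.target))


# Placeholder strings for illustration; not real intein or peptide sequences.
p = Precursor(c_intein="xxxx", target="ACDEF", n_intein="yyyyyy")
print(p.sequence())
print("".join(p.sequence()[i] for i in p.target_span()))
```

Keeping the three parts separate makes it straightforward to swap in a different target peptide while leaving the intein portions unchanged.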

Accordingly, the invention features a non-naturally occurring nucleic acid molecule
encoding a polypeptide having a first portion of a split intein, a second portion of a split
intein, and a target peptide interposed between the first portion of a split intein and the
second portion of a split intein. Expression of the nucleic acid molecule in a host system
produces a polypeptide that spontaneously splices in the host system to yield a cyclized
form of the target peptide, or a splicing intermediate of a cyclized form of the target peptide
such as an active intein intermediate, a thioester intermediate, or a lariat intermediate.

Both the first portion of a split intein and the second portion of a split intein can be
derived from a naturally-occurring split intein such as Ssp DnaE. In other variations, one
or both of the split intein portions can be derived from non-naturally occurring split inteins
such as those derived from RecA, DnaB, Psp Pol-I, and Pfu inteins.

In another aspect, the invention features a non-naturally occurring nucleic acid
molecule encoding a polypeptide having a first portion of a split intein, a second portion of a
split intein, a third portion of a split intein, and a fourth portion of a split intein. This molecule
can have a first target peptide interposed between the first portion of a split intein and the
second portion of a split intein, and a second target peptide interposed between the third
portion of a split intein and the fourth portion of a split intein. The first portion of a split
intein can be complementary to the third portion of a split intein but not complementary to
the second portion of a split intein, and the second portion of a split intein can be
complementary to the fourth portion of a split intein but not complementary to the third
portion of a split intein.

Also within the invention is an expression vector comprising a nucleic acid molecule
within the invention. Expression of the vector in a host system produces a polypeptide that
spontaneously splices in the host system to yield a cyclic peptide or a splicing intermediate.
The expression vector of the invention can also contain a regulatory sequence that facilitates
expression of the polypeptide in the host system. The nucleic acid molecule of the vector
can include a nucleotide sequence that encodes a peptide that facilitates screening of the
cyclized form of the target peptide for a particular characteristic and/or a nucleotide
sequence that encodes a peptide that facilitates purifying the cyclized form of the target
peptide from the host system. The expression vector can also be inducible.

In another aspect, the invention features an expression vector encoding a
polypeptide having a target peptide that has a first end fused to a first portion of a split intein
and a second end fused to a second portion of a split intein. Expression vectors of the
invention can be a plasmid, a bacteriophage, a virus, a linear nucleic acid molecule, or other
type of vector.

The invention additionally features a substantially pure polypeptide having a first
portion of a split intein, a second portion of a split intein, and a target peptide interposed
between the first portion of a split intein and the second portion of a split intein. The
polypeptide can be one that spontaneously splices in the host system to yield a cyclized
form of the target peptide, or it can be a splicing intermediate.
Also within the invention is a host system harboring a nucleic acid molecule of the
invention. The host system can be a prokaryote such as a bacterium, an archaebacterium,
a eukaryote such as a yeast or a mammalian cell, a plant cell, an in vitro
transcription/translation system, or a cell lysate.

In another aspect, the invention features a method for making a peptide molecule.
This method includes the steps of: providing an isolated nucleic acid molecule of the
invention; providing a host system; introducing the isolated nucleic acid molecule into the
host system; and expressing the isolated nucleic acid molecule. In one variation, the step of
expressing the isolated nucleic acid molecule results in production of a polypeptide that
spontaneously splices in the host system to yield the cyclized form of the target peptide.
This method can also feature the step of purifying the cyclized form of the target peptide
from the host system.

In another variation of this method, the step of expressing the isolated nucleic acid
molecule results in production of a splicing intermediate of a cyclized form of the target
peptide. This method can also feature the step of purifying the splicing intermediate of a
cyclized form of the target peptide from the host system. Yet another variation of this
method includes the step of forming the cyclic peptide from the splicing intermediate.

In another aspect of this method, the target peptide is produced in a cyclized form
in the host system in the absence of an exogenously-added agent such as a protease or a
thiol.

Another aspect of the invention is a method of preparing a library of peptide
molecules. This method involves the steps of providing a plurality of nucleic acid molecules
encoding a plurality of target peptides having heterogenous amino acid sequences;
incorporating each of the plurality of nucleic acid molecules into an expression vector to
form a plurality of expression vectors; and expressing the expression vectors in the host
system. The plurality of nucleic acid molecules is interposed between a nucleic acid
molecule encoding a first portion of a split intein and a nucleic acid molecule encoding a
second portion of a split intein in each of the formed expression vectors such that
expression of the expression vectors in a host system results in the production of a plurality
of peptide molecules such as polypeptides that spontaneously splice in the host system to
yield cyclized forms of the target peptides, or splicing intermediates of cyclized forms of the
target peptides.
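
One common way to obtain nucleic acid molecules encoding heterogenous target sequences is to randomize codons with degenerate nucleotides. The following minimal sketch enumerates such a library in code. It assumes, for illustration only, a template built from a fixed cysteine codon (TGY) followed by two NNS codons; the template, the function names expand and translate, and the library size are hypothetical choices and are not asserted to be the construct used in the invention.

```python
from itertools import product

# IUPAC nucleotide codes used here: N = any base, S = C or G, Y = C or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "S": "CG", "Y": "CT"}

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate: str) -> list:
    """List every concrete DNA sequence matched by a degenerate template."""
    return sorted("".join(bases) for bases in product(*(IUPAC[ch] for ch in degenerate)))


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical template: one fixed cysteine codon (TGY) followed by two randomized NNS codons.
template = "TGY" + "NNS" * 2
variants = expand(template)
peptides = {translate(v) for v in variants}
print(len(variants), len(peptides))
```

Restricting the third codon position to C or G keeps all twenty amino acids available while leaving only one stop codon (TAG) in the randomized positions.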

And in yet another aspect, the invention includes a method of screening a peptide
molecule for a predetermined characteristic. This method includes the steps of: providing a
nucleic acid molecule that encodes a polypeptide comprising a first portion of a split intein,
a second portion of a split intein, and a target peptide interposed between the first portion
of a split intein and the second portion of a split intein; providing the host system;
introducing the isolated nucleic acid molecule into the host system; placing the host system
under conditions that cause the peptide molecule to be produced; and testing the peptide
molecule for the predetermined characteristic. Expression of the nucleic acid molecule in a
host system produces either a cyclized form of the target peptide resulting from
spontaneous splicing of the polypeptide in the host system, or a splicing intermediate of a
cyclized form of the target peptide.

In one variation of this method, the predetermined characteristic includes the ability
to specifically bind a target molecule, and the step of testing the peptide molecule for the
predetermined characteristic includes the steps of (a) contacting the peptide molecule to the
target molecule and (b) determining whether the peptide molecule binds to the target
molecule. In another variation, the predetermined characteristic is the ability to modulate a
biochemical reaction, and the step of testing the peptide molecule for the predetermined
characteristic comprises the steps of (a) contacting the peptide molecule to a system
containing the biochemical reaction and (b) determining whether the peptide molecule
modulates the biochemical reaction. The step of determining whether the peptide molecule
binds to a target molecule or modulates a biochemical reaction can be performed by observing
a color change or a fluorescent signal, or by analyzing the cell cycle or the reproduction of an
organism.

The target molecule in these methods can be a cell-associated molecule such as a
membrane-associated molecule or an intracellular molecule (e.g., a nuclear molecule or one
or more organelles such as mitochondria, lysosomes, endoplasmic reticula,
golgi, and periplasm). It can also be an extracellular molecule.

The biochemical reaction can be a cell-associated process such as an intracellular
metabolic event, a membrane-associated event, or a nuclear event. It can also be an
extracellular reaction.

In these methods, the step of testing the peptide molecule for the predetermined
characteristic can be performed using a hybrid system and/or can include the step of
immobilizing the peptide molecule on a solid phase support.

The invention also features a method for purifying a cyclic peptide from a mixture.
This method includes the steps of: providing a mixture containing a splicing intermediate
conjugated with an affinity tag; mixing the conjugated splicing intermediate with a solid
phase support having a ligand thereon that specifically binds the affinity tag such that the
support becomes specifically bound with the splicing intermediate; washing the support to
remove non-specifically bound matter from the support; adding to the support a reagent
that makes a cyclic peptide from the splicing intermediate; and eluting the cyclic peptide
from the support.

In a variation of the foregoing, the invention also includes a method for purifying a
cyclic peptide from a mixture that includes the steps of: providing a mixture containing a
splicing intermediate conjugated with an affinity tag; mixing the conjugated splicing
intermediate with a solid phase support having a ligand thereon that specifically binds the
affinity tag such that the support becomes specifically bound with the splicing intermediate;
washing the support to remove non-specifically bound matter from the support; eluting the
splicing intermediate from the support; and adding to the eluted splicing intermediate a
reagent that makes a cyclic peptide from the splicing intermediate.

Additionally included in the invention is a method for purifying a target molecule that
binds a splicing intermediate from a mixture. This method includes the steps of: providing a
solid phase support having the splicing intermediate specifically bound thereon; contacting
the support with the target molecule in the mixture; washing the support to remove
non-specifically bound matter from the support; and eluting the target molecule from the
support.

As used herein, the phrase "non-naturally occurring" means being directly or
indirectly made or caused to be made through human action. Thus, a non-naturally
occurring nucleic acid molecule is one that has been produced through human manipulation,
and not natural evolutionary processes.

By the phrase "nucleic acid molecule" is meant any chain of two or more
nucleotides bonded in sequence. For example, a nucleic acid molecule can be a DNA or
an RNA.

WO 00/36093    PCT/US99/30162

As used herein, the term "peptide" means a chain of two or more amino acids
bonded in sequence, and includes polypeptides and proteins. By "polypeptide" is meant
a polymer comprised of two or more peptides, regardless of length or post-translational
modification. By "protein" is meant any chain of amino acids and includes peptides,
polypeptides, proteins, and modified proteins such as glycoproteins, lipoproteins,
phosphoproteins, metalloproteins, and the like.

A "linear peptide" is a peptide that is not in a circular form, and generally has both a
carboxy-terminal amino acid with a free carboxy-terminus and an amino-terminal amino
acid with a free amino terminus.

In comparison, a "cyclic peptide" is a peptide that has been "cyclized." The term
"cyclic" means having constituent atoms forming a ring. When referring to a peptide, the
term "cyclize" means to make the peptide into a cyclic or "cyclized" form. Thus, for
example, a linear peptide is "cyclized" when its free amino-terminus is covalently bonded to
its free carboxy-terminus (i.e., in a head to tail format) such that no free carboxy- or
amino-terminus remains in the peptide.
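
Because a head-to-tail cyclized peptide has no free terminus, the same ring can be written starting from any residue. The following minimal sketch, offered for illustration only, compares two written sequences as rings by reducing each to a canonical rotation. The function names and example strings are hypothetical; the comparison treats the direction of the backbone as significant, so a reversed sequence is not considered the same ring.

```python
def canonical_rotation(seq: str) -> str:
    """Return the lexicographically smallest rotation of seq."""
    if not seq:
        return seq
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def same_cyclic_peptide(a: str, b: str) -> bool:
    """True if a and b are rotations of one another (same ring, same direction)."""
    return len(a) == len(b) and canonical_rotation(a) == canonical_rotation(b)


# Arbitrary one-letter example sequences, not taken from the specification.
print(same_cyclic_peptide("ACDEF", "DEFAC"))
print(same_cyclic_peptide("ACDEF", "FEDCA"))
```

A canonical form of this kind is useful when tabulating a library of cyclic peptides, since it prevents one ring from being counted once for each possible starting residue.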

As used herein, a "splicing intermediate" is a polypeptide generated during the
intein-mediated cyclization reaction illustrated above prior to the formation of the liberated
cyclic peptide product. Splicing intermediates include "active-intein intermediates" (i.e.,
those with a chemical structure similar to the polypeptide labeled "A" in the above
illustration), "thioester intermediates" (i.e., those with a chemical structure similar to the
polypeptide labeled "B" in the above illustration), and "lariat intermediates" (i.e., those with
a chemical structure similar to the polypeptide labeled "C" in the above illustration).

By the phrase "target peptide" is meant a peptide to be cyclized or displayed in a
splicing intermediate. For example, a peptide interposed between a carboxy-terminal
portion of a split intein and an amino-terminal portion of a split intein in a precursor protein
would be a target peptide, if the peptide becomes cyclized upon splicing of the precursor
protein or becomes a part of a splicing intermediate upon processing (e.g., folding) of the
precursor protein.

As used herein, the word "intein" means a naturally-occurring or
artificially-constructed polypeptide sequence embedded within a precursor protein that can
catalyze a splicing reaction during post-translation processing of the protein. A list of known
inteins is published at http://www.neb.com/inteins.html. A "split intein" is an intein that has
two or more separate components not fused to one another.

As used herein, the word "interposed" means placed in between. Thus, in a
polypeptide having a first peptide interposed between a second and a third peptide, the
chain of amino acids making up the first peptide is physically located in between the chain
of amino acids making up the second peptide and the chain of amino acids making up the
third peptide.

A plurality of peptides having "heterogenous amino acid sequences" means that the
plurality of peptides is composed of at least two, but generally a large number of, different
peptides of disparate amino acid sequence.

As used herein, the phrase "host system" refers to any medium or vehicle in which a
nucleic acid molecule can be transcribed, replicated, and/or translated; and/or any medium
or vehicle in which a polypeptide can be spliced or otherwise post-translationally
processed.

As used herein, the word "spontaneously" means the action described occurs
without the addition of an exogenous substance. For example, a precursor polypeptide
within a host system spontaneously splices in the host system to yield a cyclic peptide when
nothing is added to the host system other than the precursor polypeptide or a nucleic acid
molecule encoding the precursor polypeptide. In comparison, a precursor polypeptide
within a host system does not spontaneously splice in the host system if an agent extraneous
to the host system is required to generate the cyclic peptide.

As used herein, the term "splice" or "splices" means to excise a central portion of
the polypeptide to form two or more smaller polypeptide molecules. In some cases,
splicing also includes the step of fusing together two or more of the smaller polypeptides to
form a new polypeptide.

As used herein, the word "derived" means directly or indirectly obtained from,
isolated from, purified from, descended from, or otherwise arising from.

As used herein, the phrase "expression vector" means a vehicle that facilitates
transcription and/or translation of a nucleic acid molecule in a host system. An expression
vector is "inducible" when adding an exogenous substance to a host system containing the
expression vector causes the vector to be expressed (e.g., causes a nucleic acid molecule
within the vector to be transcribed into mRNA).

By the phrase "expression of a nucleic acid" is meant that the nucleic acid is
transcribed and/or translated into a polypeptide and/or replicated.

As used herein, the phrase "regulatory sequence" means a nucleotide sequence
which modulates expression (e.g., transcription) of a nucleic acid molecule. For example,
promoters and enhancers are regulatory sequences.

By the term "fused" is meant covalently bonded to. For example, a first peptide is
fused to a second peptide when the two peptides are covalently bonded to each other (e.g.,
via a peptide bond).

As used herein, an "isolated" or "substantially pure" substance is one that has been
separated from components which naturally accompany it. Typically, a polypeptide is
substantially pure when it is at least 50% (e.g., 60%, 70%, 80%, 90%, 95%, and 99%) by
weight free from the other proteins and naturally-occurring organic molecules with which it
is naturally associated.

A "progenitor DNA" is a particular deoxyribonucleic acid from which mutations are
made or upon which mutations are based.

By the phrase "target molecule" is meant any molecule used to determine the
binding or functional characteristics of another molecule.

Herein, "bind" or "binds" means that one molecule recognizes and adheres to
another molecule in a sample, but does not substantially recognize or adhere to other
molecules in the sample. One molecule "specifically binds" another molecule if it has a
binding affinity greater than about 10^ to 10^ liters/mole for the other molecule.

A "cell-associated process" is one that takes place within a cell or in the
vicinity of the cell.

A "membrane-associated event" is a cell-associated process that takes place on the
plasma membrane of a cell.

A "nuclear event" is a cell-associated process that takes place in the nucleus of a
cell.

In comparison to a cell-associated event, an "extracellular reaction" is one that
does not take place within a cell.

By the phrase "hybrid system" is meant two-hybrid systems, reverse two-hybrid
systems, one-hybrid systems, split-hybrid systems, small molecule hybrid systems and all
like systems for identifying interactions between peptides and other molecules (e.g.,
proteins and nucleic acid molecules). For a review of exemplary hybrid systems, see Vidal
and Legrain, Nucleic Acids Res. 27:919, 1999.

Unless otherwise defined, all technical terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art to which this invention belongs.
Although methods and materials similar or equivalent to those described herein can be used
in the practice or testing of the present invention, suitable methods and materials are
described below. All publications, patent applications, patents, and other references
mentioned herein are incorporated by reference in their entirety. In the case of conflict, the
present specification, including definitions, will control. In addition, the particular
embodiments discussed below are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the following
detailed description, and from the claims.

Brief Description Of The Drawings

The invention is pointed out with particularity in the appended claims. The above
and further advantages of this invention may be better understood by referring to the
following description taken in conjunction with the accompanying drawings, in which:

Figure 1 is a schematic illustration of an overview of a general cyclization reaction
within the invention.

Figure 2 is a schematic illustration of a series of chemical reaction steps that occur
in a peptide cyclization method of the invention.

Figure 3 is a genetic map of (a) plasmid pARCP, (b) plasmid pARCP-DHFR,
(c) plasmid pARCPAH-DHFR, (d) a modified vector having a cysteine (TGY) or serine
(TCN) codon generated by cloning into the MfeI site (N represents any nucleobase, S
represents C or G, and Y represents pyrimidines), (e) plasmid pARCP-p, and (f) plasmid
pARCBD-p.

Figure 4 is a photograph of a sodium dodecylsulfate polyacrylamide gel
electrophoresis (SDS-PAGE) analysis of dihydrofolate reductase (DHFR) cyclization on a
10-20% gradient, Tris/glycine ready-gel (Biorad).

Figure 5 is a graph of the activity of wild-type DHFR (triangles) and cyclic DHFR
(diamonds) after preincubation at 65°C.

Figure 6 is a sdiematic illustration of the expected endoproteinase Lys-C digestion 
pattern for linear and cyclic DHFR. 

Figure 7 is 'a photogr^h of FeCuY plates used in an in vivo assay to detect 
tyrosinase inhibition by pseudostellarin F. 

Figure 8 is a schematic illustration of ametbod for purifying cyclic peptides within 
the invention. ~ 

Figure 9 is a schematic illustration of another method for purifying cyclic peptides 
within the invention. 

Figure 10 is a schematic illustration of a solid phase support/affinity
chromatography-based method for identifying/purifying molecules which specifically bind
splicing intermediate.

Figure 11 is a schematic illustration of another solid phase support/affinity
chromatography-based method for identifying/purifying molecules which specifically bind
splicing intermediate.

Figure 12 is a schematic illustration of another solid phase support/affinity
chromatography-based method for identifying/purifying molecules which specifically bind
splicing intermediate.

Figure 13 is a schematic illustration of another solid phase support/affinity
chromatography-based method for identifying/purifying molecules which specifically bind
splicing intermediate.

Figure 14 is a schematic illustration of the use of aptamer scaffolds in the invention. 

Figure 15 is a schematic illustration of two reactions for preparing aptamers within 
the invention. 

Figure 16 is a schematic illustration of a method for screening within the invention.

Figure 17 is a schematic illustration of another method for screening within the
invention.

Figure 18 is a schematic illustration of another method for screening within the
invention.

Figure 19 is a schematic illustration of another method for screening within the
invention.

Detailed Description 
The trans-splicing ability of split inteins has been exploited to develop a general
method of producing cyclic peptides and splicing intermediates displaying peptides in a
looped conformation. In this method, a target peptide is interposed between two portions
of a split intein in a precursor polypeptide. In an appropriate host system, the two portions
of the split intein physically come together to form an active intein in a conformation that
also forces the target peptide into a loop configuration. In this configuration, the ester
isomer of the amino acid at the junction between one of the intein portions (e.g., IN) and the
target peptide is stabilized such that a heteroatom from the other portion of the intein (e.g., IC)
can then react with the ester to form a cyclic ester intermediate. The active intein then
catalyzes the formation of an aminosuccinimide that liberates a cyclized form of the target
peptide (i.e., a lactone form), which then spontaneously rearranges to form the
thermodynamically favored backbone cyclic peptide product (i.e., the lactam form). By
arresting the reaction at given points before liberation of the cyclic peptide, splicing
intermediates bearing the target peptide in a loop configuration can be produced. To
produce such peptides, nucleic acid molecules encoding a polypeptide having the target
peptide sequence interposed between the two intein portions can be constructed.
Introduction of these constructs into an expression vector provides a method for producing
the polypeptide in a host system, where the polypeptide can be spliced into a cyclic peptide
or a splicing intermediate. Using this method, several different cyclic peptides or splicing
intermediates can be prepared to generate a library of cyclized or partially-cyclized
peptides that can be screened for particular characteristics.

Referring to FIG. 1, an overview of an embodiment of the invention includes a
method of making a cyclic peptide from a nucleic acid molecule. In this method, a nucleic
acid molecule is prepared so that its nucleotide sequence encodes a polypeptide having in
consecutive order a first portion of a split intein (e.g., IC), a peptide to be cyclized (i.e., a
target peptide), and a second portion of a split intein (e.g., IN). The nucleic acid molecule
can be incorporated into an expression vector to facilitate its expression in a host system
where the nucleic acid can be transcribed and translated into a precursor polypeptide
having the peptide to be cyclized interposed between the two split intein portions. By the
steps described above, the two portions of the split intein come together and place the
precursor peptide in a conformation that sets off chemical reactions that ultimately yield a
cyclic peptide (see FIG. 2).

Nucleic Acid Molecules 

Nucleic acid molecules within the invention include those that encode a polypeptide
having a first portion of a split intein, a second portion of a split intein, and a target peptide
positioned in between the first portion of a split intein and the second portion of a split
intein. In one embodiment of the invention, expression of the nucleic acid molecule in a host
system results in a polypeptide that spontaneously splices in the host system to yield a
cyclized form of the target peptide. In another embodiment of the invention, expression of
the nucleic acid molecule in a host system results in a polypeptide that is a splicing
intermediate of a cyclized form of the target peptide. The nucleic acids of the invention can
be prepared according to the methods described herein, and by using
the guidance provided herein in conjunction with methods for preparing and manipulating
nucleic acid molecules generally known in the art (see, e.g., Ausubel et al., Current
Protocols in Molecular Biology, New York: John Wiley & Sons, 1997; Sambrook et al.,
Molecular Cloning: A Laboratory Manual (2nd Edition), Cold Spring Harbor Press, 1989).
For example, a nucleic acid molecule within the invention can be made by separately
preparing a polynucleotide encoding the first portion of a split intein, a polynucleotide
encoding the second portion of a split intein, and a polynucleotide encoding the target
peptide. The three polynucleotides can be ligated together to form a nucleic acid molecule
that encodes a polypeptide having the target peptide interposed between the first portion of
a split intein and the second portion of a split intein.
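
The three-part ligation described above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `assemble_precursor_orf` is hypothetical, the sequences in the usage note are short placeholders rather than real intein sequences, and the reading-frame check is an added convenience that the specification does not itself recite.

```python
def assemble_precursor_orf(first_intein: str, target: str, second_intein: str) -> str:
    """Join three coding segments in the order
    first intein portion, target peptide, second intein portion.

    Assumption (not from the specification): every segment is supplied
    as a whole number of codons, so the joined sequence stays in frame.
    """
    parts = [first_intein.upper(), target.upper(), second_intein.upper()]
    for seq in parts:
        if set(seq) - set("ACGT"):
            raise ValueError("segment contains characters other than A, C, G, T")
        if len(seq) == 0 or len(seq) % 3 != 0:
            raise ValueError("segment length must be a positive multiple of 3")
    return "".join(parts)
```

For instance, `assemble_precursor_orf("atgaaa", "TGTGGC", "GGATAA")` joins the three placeholder segments in the stated order, while a segment of length 4 is rejected because it would shift the reading frame of everything downstream.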

Nucleic Acids Encoding Split Inteins

Nucleotide sequences that encode the first portion of a split intein and the second
portion of a split intein of the nucleic acid molecules within the invention can be derived
from known inteins. A fairly comprehensive and descriptive list of such inteins is published
by New England Biolabs at http://www.neb.com/inteins/int_reg.html. Any of these known
inteins can be used as long as they are compatible with the invention.

Nucleotide sequences that encode either naturally-occurring or artificially-produced
split inteins can be used to generate the intein portions of nucleic acid molecules within the
invention. Naturally-occurring split inteins are expressed in nature as two separate
components that bind one another to form one active splicing agent. The nucleic acid
molecules encoding these naturally-occurring components can thus be used in the invention.
One example of a naturally-occurring split intein that may be used is Ssp DnaE (Wu et al.,
Proc. Natl. Acad. Sci. USA 95:9226, 1998).

Inteins that are not split in their natural state (i.e., those that exist as one continuous
chain of amino acids) can be artificially split using known techniques. For example, two or
more nucleic acid molecules encoding different portions of such inteins can be made so that
their expression yields two or more artificially split intein components. See, e.g., Evans et
al., J. Biol. Chem. 274:18359, 1999; Mills et al., Proc. Natl. Acad. Sci. USA 95:3543,
1998. The nucleic acids that encode such non-naturally occurring intein components
(portions) can be used in the invention. Those nucleic acid molecules that encode non-
naturally occurring split intein portions which efficiently interact on the same precursor
polypeptide to yield cyclic peptides or splicing intermediates are preferred. Examples of
non-naturally occurring split inteins from which such nucleic acid molecules can be derived
include Psp Pol-1 (Southworth, M.W., et al., The EMBO J. 17:918, 1998),
Mycobacterium tuberculosis RecA intein (Lew, B.M., et al., J. Biol. Chem. 273:15887,
1998; Shingledecker, K., et al., Gene 207:187, 1998; Mills, K.V., et al., Proc. Natl. Acad.
Sci. USA 95:3543, 1998), Ssp DnaB/Mxe GyrA (Evans, T.C., et al., J. Biol. Chem.
274:18359, 1999), and Pfu (Otomo et al., Biochemistry 38:16040, 1999; Yamazaki et al.,
J. Am. Chem. Soc. 120:5591, 1998).

Nucleic Acids Encoding Target Peptides or Peptides Displayed in Splicing
Intermediates

Numerous methods of making nucleic acids encoding peptides of a known or
random sequence are known in the art. For example, polynucleotides having a
predetermined or a random sequence can be prepared chemically by solid phase synthesis
using commercially available equipment and reagents. Polymerase chain reaction can also
be used to prepare polynucleotides of known or random sequences. See, e.g., Ausubel et
al., supra. As another example, restriction endonucleases can be used to enzymatically
digest a larger nucleic acid molecule or even whole chromosomal DNA into a plurality of
smaller polynucleotide fragments that can be used to prepare nucleic acid molecules of the
invention.

Polynucleotides that encode peptide sequences to be cyclized are preferably
prepared so that one terminus of the polynucleotide encodes an asparagine, serine,
cysteine, or threonine residue to facilitate the cyclization reaction. For the same reason,
polynucleotides that encode peptide sequences for production of splicing intermediates are
preferably prepared so that the terminus encodes an amino acid other than an asparagine,
serine, cysteine, or threonine residue so that the cyclization reaction is prevented.
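
The terminal-residue rule above lends itself to a simple check on a candidate coding sequence. The following Python sketch is illustrative only: the function names are hypothetical, the standard genetic code is assumed, and because the text says only "one terminus", the terminus to inspect is left as a parameter rather than fixed.

```python
# Standard genetic code, with codon positions ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Residues named in the text as facilitating cyclization: Asn, Ser, Cys, Thr.
CYCLIZATION_RESIDUES = frozenset("NSCT")


def terminal_residue(orf: str, terminus: str = "first") -> str:
    """Return the one-letter residue encoded by the first or last codon."""
    orf = orf.upper()
    if len(orf) < 3 or len(orf) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    if terminus not in ("first", "last"):
        raise ValueError("terminus must be 'first' or 'last'")
    codon = orf[:3] if terminus == "first" else orf[-3:]
    return CODON_TABLE[codon]


def permits_cyclization(orf: str, terminus: str = "first") -> bool:
    """True if the chosen terminal codon encodes Asn, Ser, Cys, or Thr."""
    return terminal_residue(orf, terminus) in CYCLIZATION_RESIDUES
```

Under these assumptions, a sequence intended for full cyclization would be expected to return True, while one designed to stall as a splicing intermediate would be expected to return False at the same terminus.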

Ligation of Polynucleotides Encoding Intein Portions and Target Peptides or Peptides
Displayed in Splicing Intermediates

Once these are generated, conventional methods can be used to ligate nucleic acid molecules
encoding intein portions to a nucleic acid molecule encoding a target peptide (or peptide
within a splicing intermediate) to form a larger nucleic acid molecule encoding a polypeptide
having the first intein portion-target peptide-second intein portion order. See, e.g., Ausubel
et al., supra.

Nucleic Acid Molecules that Encode Multiple Split Inteins and Multiple Peptides

Using techniques similar to those described above, one skilled in the art could also
prepare nucleic acid constructs that encode more than one set of two portions of a split
intein interposed with peptides. For example, the invention includes nucleic acid
molecules encoding precursor polypeptide molecules comprised of N polypeptides (N =
an integer greater than or equal to 1) and having N target peptides interposed between 2N
intein portions such that any target peptide i (i = an integer greater than or equal to 1
representing the position of a target peptide in the precursor polypeptide) is interposed
between intein portions 2i-1 and 2i (e.g., target peptide 1 is between intein portions 1 & 2,
target peptide 2 is between intein portions 3 & 4, etc.). As long as intein portions 2i-1 and
2i are not complementary (i.e., able to physically interact to catalyze a splicing event),
target peptide i cannot cyclize. If, however, intein portion 2i is complementary with intein
portion 2i+1 and intein portion 2N is complementary with intein portion 1, the entire
ensemble of N polypeptides can perform N-1 trans splices (between 2 polypeptides) and 1
cis splice (ligating the two ends together) to give rise to a product wherein target peptides
1 through N are covalently attached to one another in a cyclic peptide/protein (e.g., intein
portions 2 & 3 trans-splice target peptides 1 & 2; intein portions 4 & 5 trans-splice target
peptides 2 & 3; intein portions 2N-2 & 2N-1 trans-splice target peptides N-1 & N; and
intein portions 2N & 1 cis-splice to close the cyclic product containing the N target
sequences). The order of trans/cis splicing events is irrelevant. The slowest splicing species
(whether it is the complementary intein portions 2N & 1, 2 & 3, or 80 & 81) will by default
perform the cis-splice.
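
The indexing scheme in the preceding paragraph can be made concrete with a short Python sketch. The function names are hypothetical and the code only enumerates indices; it is not a model of splicing chemistry. Consistent with the text, no pair is labeled as the cis splice, since whichever complementary pair reacts last closes the ring.

```python
def flanking_portions(n: int) -> dict:
    """Map each target peptide i (1..n) to its flanking intein portions (2i-1, 2i)."""
    if n < 1:
        raise ValueError("n must be an integer greater than or equal to 1")
    return {i: (2 * i - 1, 2 * i) for i in range(1, n + 1)}


def complementary_pairs(n: int) -> list:
    """List the intein-portion pairs that must be complementary.

    Pairs (2i, 2i+1) for i = 1..n-1 join neighbouring polypeptides, and
    the pair (2n, 1) joins the last polypeptide back to the first.
    """
    if n < 1:
        raise ValueError("n must be an integer greater than or equal to 1")
    pairs = [(2 * i, 2 * i + 1) for i in range(1, n)]
    pairs.append((2 * n, 1))
    return pairs
```

For N = 3, the targets sit between portions (1, 2), (3, 4), and (5, 6), and the complementary pairs are (2, 3), (4, 5), and (6, 1). That gives N splicing events in total, of which N-1 are trans splices and one is the ring-closing cis splice.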

Thus, nucleic acid constructs can be made that express two or more polypeptides
each composed of a target peptide interposed between two portions of a split intein where
the intein components are not complementary (i.e., do not derive from the same intein or
otherwise cooperate to catalyze any of the cyclization reactions). In such constructs, no
one polypeptide could be cyclized unless it was expressed in the presence of a second
polypeptide having the appropriate complementary intein component. Constructs of such
nucleic acids within the invention could encode only one polypeptide per construct or more
than one polypeptide per construct (e.g., a bi-functional plasmid).

Expression Vectors

The expression vectors of the present invention can be prepared by inserting
polynucleotides encoding a target peptide into any suitable expression vector that can
facilitate expression of the polynucleotide in a host system. Such suitable vectors include
plasmids, bacteriophages, and viral vectors. A large number of these are known in the art
and many are commercially available or obtainable from the scientific community. Those of
skill in the art can select suitable vectors for use in a particular application based upon, e.g.,
the type of host system selected (e.g., in vitro systems, prokaryotic cells such as bacteria,
and eukaryotic cells such as yeast or mammalian cells) and the expression conditions
selected.

Expression vectors within the invention can include a stretch of nucleotides that
encodes a target polypeptide and a stretch of nucleotides that operate as a regulatory
domain that modulates or controls expression (e.g., transcription) of nucleotide sequences
within the vector. For example, the regulatory domain can be a promoter or an enhancer.

Expression vectors within the invention can include nucleotide sequences that
encode a peptide that facilitates screening of the cyclized form of the target peptide or
splicing intermediate for a particular characteristic (e.g., an affinity tag such as a chitin-
binding domain or a biotin tag; a colored or light-emitting label; a radioactive tag; etc.), or
purifying the cyclized form of the target peptide or splicing intermediate from a host system
(e.g., an affinity tag such as a chitin-binding domain, a biotin tag, a colored or light-emitting
label; a radioactive tag; etc.).

In preferred embodiments, the expression vectors within the invention are produced
with restriction sites both between and within the nucleic acid sequences that encode the
split intein portions to enable the cloning of a wide variety of cyclization targets or splicing
intermediates. In some embodiments, an expression vector of the invention can be an
inducible expression vector, such as an arabinose inducible vector. Such vectors can be
utilized to control expression of cyclization precursors or splicing intermediates within a host
system. Other vectors can be selected for use in the invention based on their compatibility
with known bacterial expression strains and hybrid systems. See, e.g., Zhang et al., Curr.
Biol. 9:417, 1999; Pelletier et al., Nat. Biotechnol. 17:683, 1999; Karimova et al., Proc.
Natl. Acad. Sci. USA 95:5752, 1998; Dmitrova et al., Mol. Gen. Genet. 257:205, 1998;
Xu et al., Proc. Natl. Acad. Sci. USA 96:151, 1999; Rossi et al., Proc. Natl. Acad. Sci.
USA 94:8405, 1997.



Polypeptides 

Polypeptides within the invention include any that can be produced by expression of
a nucleic acid of the invention. For example, a substantially pure precursor polypeptide that
has a target peptide (or a peptide to be displayed by a splicing intermediate) interposed
between the first portion of a split intein and the second portion of a split intein is included in
the invention. In some embodiments of the precursor polypeptide, the target peptide may
be directly fused to the first and second intein portions. The precursor polypeptide
spontaneously splices in the host system to yield a cyclized form of the target peptide (or a
splicing intermediate displaying a peptide).

Cyclized forms of target peptides and splicing intermediates displaying peptides
are also within the invention. Preferably, these are produced by splicing of a precursor
polypeptide of the invention. The cyclized forms of target peptides can be of any amino
acid sequence that can be cyclized by the methods of the invention. The splicing
intermediate can be an active intein intermediate, a thioester intermediate, or a lariat
intermediate, and can display a peptide of any compatible amino acid sequence.

Host Systems

Host systems that may be used in the invention include any systems that
support transcription, translation, and/or replication of a nucleic acid molecule of the
invention; or that support post-translational modification (e.g., splicing) of a polypeptide or
protein of the invention. Numerous such host systems are known. For example, in the
invention, especially when it is desired to avoid artifacts or interference caused by living
host systems, the host system can take the form of an in vitro transcription/translation
system. Such systems can be fabricated in the laboratory according to published
techniques or can be commercially purchased. For instance, STP2-T7 (cat. No. 69950-3)
and STP-SP6 (cat. No. 69997-3) are available from Novagen (Madison, WI). Promega
(Madison, WI) also sells such systems (e.g., cat. Nos. L1170, L2080, L4600, L4610,
L4130, L4140, L1130, L1020, and L1030), as does Stratagene (La Jolla, CA), which
markets a system branded IN VITRO EXPRESS (cat. No. 200360). Non-living host
systems for use in the invention can also be derived from a living organism. For example, a
cell lysate such as a reticulocyte lysate can be used in some applications.

Host systems can also take the form of living organisms. Living organisms are
preferred for host systems because they can usually be reproduced in numerous copies,
thereby providing a continuous, readily-expandable, and easily-manipulated source of
selected nucleic acid molecules. Living organisms that can be used as host systems within
the invention include prokaryotes such as bacteria (e.g., Escherichia coli) and eukaryotes
such as yeasts and mammalian (e.g., human, murine, bovine, ovine, porcine, etc.) cells.
Archaebacteria, plant cells, and any other organism suitable for use with the methods of the
invention can also function as the host system.

The particular host system best suited for a particular application will vary
depending on many different factors. One of skill in the art, however, should be able to
select a suitable host system for a particular application based on known applications of the
different host systems. For example, where large scale production of a cyclic peptide is
desired, a bacterial host or an insect host would be suitable. As another example, where it
is desired to analyze the interaction of human cell components, using a human cell as the
host system would likely be more appropriate than using a bacterial system.

Method of Making a Polypeptide, Cyclic Peptide, or a Splicing Intermediate

The polypeptides of the invention can be prepared by conventional methods of
producing polypeptides of a known amino acid sequence. For example, polypeptides
within the invention can be made by solid phase synthesis using commercially available
equipment and reagents. Known in vitro methods of producing cyclic peptides can also
be used to produce cyclic peptides. In many cases, however, the polypeptides of the
invention are preferably produced by expressing nucleic acid molecules encoding them in a
host system. For example, nucleic acid molecules within the invention can be incorporated
into an expression vector and then introduced into a host system. The host system can then
be placed under conditions that cause the vector to be expressed, resulting in the formation
of a precursor peptide and subsequently a cyclized form of the target peptide or a splicing
intermediate displaying the peptide.

A preferred method for making a cyclic peptide or a splicing intermediate includes
the steps of: (a) providing an isolated nucleic acid molecule that encodes a polypeptide
having a target peptide interposed between the first portion of a split intein and the second
portion of a split intein; (b) providing a host system; (c) introducing the isolated nucleic acid
molecule into the host system; and (d) expressing the isolated nucleic acid molecule.
Expression of the nucleic acid molecule in the host system produces the peptide molecule in
the form of a splicing intermediate of a cyclized form of the target peptide, or a polypeptide
that spontaneously splices to yield a cyclized form of the target peptide.

In preferred embodiments of this method, production of the polypeptides, cyclic
peptides, or splicing intermediates takes place in vivo (e.g., with a living host system) and in
the absence of any exogenously-added agent, such as an agent to catalyze cyclization of a
peptide (e.g., a protease or a thiol).

Production of polypeptides, cyclic peptides, or splicing intermediates can be
monitored using standard techniques for characterizing proteins. See, e.g., Sambrook et al.,
supra. Exemplary techniques that can be used include conventional chromatography,
HPLC, FPLC and the like; electrophoresis such as sodium dodecyl sulfate polyacrylamide
gel electrophoresis (SDS/PAGE), 2-dimensional gel electrophoresis; electromagnetic
radiation-based spectroscopy, mass spectroscopy; analysis of enzymatic digestion
products; thermostability assays; etc.

Purification of Polypeptides, Cyclic Peptides, or Splicing Intermediates of the Invention

Conventional methods of purifying proteins can be adapted to purify the
polypeptides, cyclic peptides, and splicing intermediates of the invention. The invention
also includes a preferred method for purifying a cyclic peptide from a mixture. In this
method, an affinity tag is attached to the cyclic peptide to aid in its purification. This
method includes the steps of: (a) providing a mixture containing a cyclic peptide conjugated
with an affinity tag; (b) mixing the conjugated cyclic peptide with a solid phase support
having a ligand thereon that specifically binds the affinity tag so that the support becomes
specifically bound with the cyclic peptide; (c) washing the support to remove non-
specifically bound matter; and (d) eluting the cyclic peptide from the support.

In this method, the affinity tag can be any molecule that can bind a ligand on a solid
phase support. For example, the affinity tag can be a chitin-binding domain where the
ligand is chitin (see examples section below) or it can be a biotin tag where the ligand is
streptavidin. Many other affinity tag-ligand pairs are known and can be used in the
invention. Because the affinity tag specifically binds the ligand on the solid phase support,
the cyclic peptides (with the attached affinity tag) will specifically bind the support. The
support can then be washed with a buffer (e.g., a high salt, acid or alkaline buffer) that
removes matter within the mixture that is non-specifically bound to the support. The
affinity-tagged cyclic peptide can then be eluted from the solid phase support using a buffer
containing a substance that separates the tag from the ligand (e.g., a competitive inhibitor
such as excess unconjugated affinity tag; or a denaturing agent), or an enzyme or chemical
reactant that cleaves the cyclic peptide from the affinity tag.

In an analogous manner, splicing intermediates rather than cyclic peptides can be
purified. Cyclic peptides can also be purified from a mixture using splicing intermediates.
For example, a method for purifying a cyclic peptide from a mixture includes the steps of:
(a) providing a mixture containing a splicing intermediate conjugated with an affinity tag; (b)
mixing the conjugated splicing intermediate with a solid phase support having a ligand
thereon that specifically binds the affinity tag such that the support becomes specifically
bound with the splicing intermediate; (c) washing the support to remove non-specifically
bound matter; (d) adding to the support a reagent that makes a cyclic peptide from the
splicing intermediate; and (e) eluting the cyclic peptide from the support. In a variation of
the foregoing, steps (d) and (e) are reversed so that step (d) is eluting the splicing
intermediate from the support and step (e) is adding to the eluted splicing intermediate a
reagent that makes a cyclic peptide from the splicing intermediate. Reagents that may be
added to make a cyclic peptide from a splicing intermediate include thiols, proteases, and
other substances which can catalyze cyclization of the splicing intermediate.

As a specific example, by fusing IC to an affinity tag and removing the essential
asparagine residue (see d in FIG. 3), a cyclic ester can be immobilized on an affinity
column. The resulting cyclic peptide column can be used for the affinity purification of the
cyclic peptide itself. A wide range of proteolytic methods can be employed to liberate the
cyclic ester from the affinity tag and IC depending upon the sequence of the cyclic peptide
product.

Referring now to FIG. 8, a method for purifying cyclic peptides is shown. In this
method, an active intein intermediate (species 1) is mutagenized to replace the catalytic
asparagine (step A) with a non-catalytic amino acid (Y) and to introduce an affinity tag
downstream of IN (step B) to yield species 2. The intein-mediated cyclization reaction will
proceed until the lariat intermediate is formed (step C). This molecule is then passed
through an affinity column (step D) having a solid phase support with a ligand thereon that
specifically binds the affinity tag and thus allows retention and purification of the IN/IC non-
covalent complex (species 3). The IN/IC reaction is then disrupted to yield a lariat
intermediate (species 4) which can be eluted from the affinity column. Proteolytic or
chemical cleavage at amino acid Y (step F) liberates the lactone intermediate (species 5).
Acyl-to-N rearrangement (step G) yields the thermodynamically preferred amide cyclic
product (species 6).

Referring now to FIG. 9, a variation of the foregoing method for purifying cyclic
peptides is shown. In this method, an active intein intermediate (species 1) is mutagenized
to replace the catalytic asparagine (step A) with a non-catalytic amino acid (Y) and to
introduce an affinity tag upstream of IC (step B) to yield species 2. The intein-mediated
cyclization reaction will proceed until a lariat intermediate is formed (step C). This molecule
is then passed through an affinity column (step D) having a solid phase support with a ligand
thereon that specifically binds the affinity tag/IC intermediate (species 3). Separation of the
affinity tag (step E) from the ligand (e.g., using a molecule that competitively inhibits the
tag-ligand interaction, using a high salt buffer or denaturing agent, or using a chemical
reagent or protease to cleave the tag) allows recovery of the lariat intermediate (species 4).
Proteolytic or chemical cleavage at amino acid Y (step F) liberates the lactone intermediate
(species 5). Acyl-to-N rearrangement (step G) yields the thermodynamically preferred
amide cyclic product (species 6).



Method for Preparing a Library of Cyclic Peptides and Splicing Intermediates

Numerous methods of making linear peptide libraries are known in the art.
Modifications of such known methods can be utilized with the methods of producing cyclic
peptides and splicing intermediates taught herein to generate libraries of cyclic peptides and
splicing intermediates. In general, a method of preparing a library of cyclic peptides and/or
splicing intermediates includes the steps of: (a) providing a plurality of nucleic acid
molecules encoding a plurality of target peptides having heterogeneous amino acid
sequences; (b) incorporating each of the plurality of nucleic acid molecules into an
expression vector to form a plurality of expression vectors, whereby each of the plurality of
nucleic acid molecules is interposed between a nucleic acid molecule encoding a first
portion of a split intein and a nucleic acid molecule encoding a second portion of a split
intein in each of the formed expression vectors, such that expression of the expression
vectors in a host system results in the production of a plurality of splicing intermediates of
cyclized forms of the target peptides or polypeptides that spontaneously splice in the host
system to yield cyclized forms of the target peptides; and (c) expressing the expression
vectors in the host system.

As more specific examples, the methods described in Childs et al., in Sequence
Specificity in Transcription and Translation (Alan R. Liss, Inc., 1985) and the double strand
ligation method described in Schumacher et al., Science 271:1854, 1996 can be modified
for use in the current invention. Known PCR-based methods can also be used to generate
polynucleotides encoding peptides with random sequences that can be circularized or
expressed as splicing intermediates in the invention. See, e.g., Caldwell and Joyce, PCR
Methods Appl. 2:22, 1992; Ostermeier et al., Proc. Natl. Acad. Sci. USA 96:3562, 1999;
the Nested Deletion Protocol and Reagents from Promega; and Stemmer, W.P., Nature
370:389, 1994 (DNA shuffling). The plurality of polynucleotides encoding peptides with
heterogeneous sequences can be incorporated as the target peptide (or the peptide to be
displayed in a splicing intermediate) in the nucleic acid molecules and expression vectors of
the invention as described above and then expressed in a host system to make a library of
cyclic peptides or splicing intermediates.
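
As one illustration of how polynucleotides encoding random peptide sequences might be designed, the Python sketch below enumerates NNS codons (N = any nucleobase, S = C or G) and draws random inserts from them. This is a sketch under stated assumptions: the specification does not prescribe NNS randomization for the library, the function names are hypothetical, and the standard genetic code is assumed.

```python
import random

# Standard genetic code, with codon positions ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def nns_codons() -> list:
    """All codons matching the degenerate pattern NNS (S = C or G)."""
    return [a + b + c for a in "ACGT" for b in "ACGT" for c in "CG"]


def random_nns_insert(n_codons: int, rng: random.Random) -> str:
    """Draw a random coding insert of n_codons NNS codons."""
    codons = nns_codons()
    return "".join(rng.choice(codons) for _ in range(n_codons))


def translate(orf: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    orf = orf.upper()
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf) - 2, 3))
```

Under these assumptions the NNS pattern spans 32 codons that cover all twenty amino acids while admitting a single stop codon (TAG), which is why it is a common choice for randomized inserts. A five-residue insert therefore has 32^5 possible DNA sequences encoding at most 20^5 distinct stop-free peptides.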

Method for Screening a Cyclic Peptide for a Predetermined Characteristic

Myriad techniques exist for screening small molecules for particular characteristics.
See, e.g., Fernandes, P., Current Opin. Chem. Biol. 2:597, 1998; Science 286:1759, 1999;
U.S. Patent Nos. 5,585,277 and 5,989,814. More specifically, methods for determining
which peptide in a combinatorial peptide library binds specifically to a target protein are
also known. E.g., U.S. Patent No. 5,834,318. Many of these methods can be adapted to
screen cyclic peptides and/or splicing intermediates made with the methods of the invention
for particular characteristics.

A general method of screening a peptide molecule for a predetermined
characteristic includes the steps of: (a) providing a nucleic acid molecule that encodes a
polypeptide having a target peptide interposed between a first portion of a split intein and a
second portion of a split intein such that expression of the nucleic acid molecule in a host
system produces the peptide molecule either as a cyclized form of the target peptide (as a
result of spontaneous splicing of the polypeptide in the host system) or a splicing
intermediate of a cyclized form of the target peptide; (b) providing the host system; (c)
introducing the isolated nucleic acid molecule into the host system; (d) placing the host system
under conditions that cause the peptide molecule to be produced; and (e) testing the
peptide molecule for the predetermined characteristic.

Step (a) can be performed as described elsewhere herein by, for example, using
molecular biology techniques (see, Ausubel et al. and Sambrook et al., supra) to produce
polynucleotides encoding the target peptide, the first portion of an intein, and the second
portion of an intein. The resulting three polynucleotides can then be fused (e.g., ligated)
together to form the nucleic acid molecule. The host system provided can be any of those
described herein in which the nucleic acid molecule can be expressed (e.g., a bacterium, a
yeast, a mammalian cell, etc.). The nucleic acid can be introduced into the host system by
known methods depending on the form of the nucleic acid molecule and the host system
used. For example, the nucleic acid molecules can be introduced into a cell by
electroporation, lipofection, using calcium chloride-mediated transformation, using a gene
"gun," using a bacteriophage vector (when the host system is a bacterium), using a plasmid
construct, using a viral vector, etc.
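The fusion of the three polynucleotides in step (a) can be pictured with a short Python
sketch. It only concatenates coding segments in the order Ic, target, In and checks that each
segment keeps the reading frame; the function name and the toy sequences are hypothetical
placeholders, not real intein or target sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split a DNA string into consecutive three-base codons."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def assemble_precursor(ic_orf, target_orf, in_orf):
    """Fuse Ic, target and In coding segments in frame, in the order Ic-target-In."""
    for name, seg in (("Ic", ic_orf), ("target", target_orf), ("In", in_orf)):
        if len(seg) % 3 != 0:
            raise ValueError(f"{name} segment is not a whole number of codons")
        if any(c in STOP_CODONS for c in codons(seg.upper())):
            raise ValueError(f"{name} segment contains an in-frame stop codon")
    # A stop codon (or a downstream tag) would be appended after In in a real construct.
    return (ic_orf + target_orf + in_orf).upper()


# Toy placeholder segments, for illustration only.
IC_TOY = "ATGGCTGCAAAC"
TARGET_TOY = "TCTGGTGGT"
IN_TOY = "TGCCTGTCT"

precursor = assemble_precursor(IC_TOY, TARGET_TOY, IN_TOY)
print(precursor)
```

Checking each segment separately makes a frame shift or premature stop easy to trace to the
piece that introduced it.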

The host system can be placed under conditions that cause the peptide molecule to
be produced by adjusting the conditions according to the particular form of the nucleic acid
molecule and the host system used. For a human cell host system, this can mean placing
the cell in an appropriate nutrient rich medium and culturing the cell in a 37°C, humidified,
5-10% CO2 incubator. For inducible expression vectors, this can mean adding to the host
system the substance that induces expression of the nucleic acid molecule in the
vector. For example, when using an arabinose-inducible expression vector, this step can
include adding arabinose to the host system.

Testing of the peptide molecule for the predetermined characteristic can be
performed by a large number of different methods, e.g., measuring binding of the peptide
molecule to a known ligand and analyzing the ability of the peptide molecule to modulate
(i.e., increase or decrease the rate of) a biochemical reaction. For a description of various
methods that may be used for testing peptides for predetermined characteristics see,
Fernandes, P., supra.

More specific exemplary methods that may be used to screen cyclic peptides or
splicing intermediates for a particular characteristic include using a solid phase support and
affinity chromatography to identify molecules which specifically bind cyclic peptides or
splicing intermediates; using phage display technology; and using aptamer peptide fusion
constructs and/or hybrid systems to identify cyclic peptides or splicing intermediates that
can modulate a specific biochemical reaction or intracellular event.



Solid Phase Supports/Affinity Chromatography For Identifying Molecules That
Interact with Cyclic Peptides and/or Splicing Intermediates

Cyclic peptides or splicing intermediates can be immobilized on a solid phase
support to facilitate identification and/or purification of molecules that specifically bind a
given cyclic peptide or splicing intermediate. For a general overview of peptide affinity
columns for purification see Baumbach, G.A. and D.J. Hammond, Biopharm., 5:24, 1992.
Below are examples of how this can be performed in the invention.

Referring now to FIG. 10, a method for identifying/purifying molecules that
specifically bind a given splicing intermediate is shown. In this method, an active intein
intermediate (species 1) is mutagenized to replace the catalytic asparagine (step A) with a
non-catalytic amino acid (Y) and to introduce an affinity tag upstream of Ic (step B) to yield
species 2. The intein-mediated cyclization reaction will proceed until a lariat intermediate is
formed (step C). This molecule is then passed through an affinity column (step D) having a
solid phase support with a ligand thereon that specifically binds the affinity tag/Ic
intermediate (species 3). A solution containing target molecules (i.e., candidates for binding
the splicing intermediates) is then passed through the column (step E). Target molecules
that specifically bind the splicing intermediate are selectively retained within the column.
These target molecules can be removed from the column and biochemically analyzed (e.g.,
sequenced).

Referring now to FIG. 11, another method for identifying/purifying molecules
which specifically bind a given splicing intermediate is shown. In this method, an active
intein intermediate (species 1) is mutagenized to replace the catalytic asparagine (step A)
with a non-catalytic amino acid (Y) and to introduce an affinity tag downstream of In (step
B) to yield species 2. The intein-mediated cyclization reaction will proceed until a lariat
intermediate is formed (step C). This molecule is then passed through an affinity column
(step D) having a solid phase support with a ligand thereon that specifically binds the affinity
tag to allow the retention and purification of the In/Ic non-covalent complex (species 3).
Cleavage of the affinity tag (step E) allows recovery of the lariat intermediate (species 4). A
solution containing target molecules (i.e., candidates for binding the splicing intermediates) is
then passed through the column (step E). Target molecules that specifically bind the
splicing intermediate are selectively retained within the column.

Referring now to FIG. 12, another method for identifying/purifying molecules that
specifically bind a given splicing intermediate is shown. In this method, an active intein
intermediate (species 1) is mutagenized to replace the In nucleophile (step A) with a non-catalytic
amino acid (Z) and to introduce an affinity tag downstream of In (step B) to yield
an Ic-peptide-In-tag fusion protein (species 2). The intein-mediated cyclization reaction will
produce the fusion protein (step C). This protein is then passed through an affinity column
(step D) having a solid phase support with a ligand thereon that specifically binds the affinity
tag to allow the retention and purification of the protein complex (species 3). A solution
containing target molecules (i.e., candidates for binding the splicing intermediates) is then
passed through the column (step E). Target molecules that specifically bind the splicing
intermediate are selectively retained within the column.

Referring now to FIG. 13, yet another method for identifying/purifying molecules
which specifically bind a given splicing intermediate is shown. In this method, an active
intein intermediate (species 1) is mutagenized to replace the In nucleophile (step A) with a
non-catalytic amino acid (Z) and to introduce an affinity tag upstream of Ic (step B) to yield
a tag-Ic-peptide-In fusion protein (species 2). The intein-mediated cyclization reaction will
produce the fusion protein (step C). This protein is then passed through an affinity column
(step D) having a solid phase support with a ligand thereon that specifically binds the affinity
tag to allow the retention and purification of the protein complex (species 3). A solution
containing target molecules (i.e., candidates for binding the splicing intermediates) is then
passed through the column (step E). Target molecules that specifically bind the splicing
intermediate are selectively retained within the column.

Phage Display

Methods of screening molecules using phage display are also within the invention.
Conventional methods using phage display can be modified by using the phage to display
the cyclic peptides and/or splicing intermediates of the invention. For example, if Z in FIG.
1 is a phage coat protein and XH=H in FIG. 2, the splicing reaction will not progress
beyond the first ester intermediate, thus resulting in the target peptide being displayed as a
loop. In this manner, libraries comprising phage particles displaying loop target peptides
can be prepared and used to pan for molecules that bind the displayed loop. For instance,
a target molecule can be immobilized on a solid phase support. Phage libraries displaying
different looped peptides can then be mixed with the support. Those phage displaying
looped peptides that bind the target molecule would be selectively retained on the support.
After elution from the support (e.g., by cleavage of the ester linkage of the phage-displayed
loop peptides with high concentrations of a potent nucleophile), the amino acid sequences
of the looped peptides can be determined by standard molecular biology methods.

Aptamers

Peptide aptamers are polypeptides that contain a conformationally constrained target
peptide region of variable sequence displayed from a scaffold. Since cyclic peptides or
splicing intermediates can function as aptamers, known methods of analyzing aptamers can
be modified to assist in identifying particular characteristics of the cyclic peptides or splicing
intermediates. See, e.g., Geyer et al., Proc. Natl. Acad. Sci. USA 96:8567, 1999;
Caponigro et al., Proc. Natl. Acad. Sci. USA 95:7508, 1998; Mikhail et al., Proc. Natl.
Acad. Sci. USA 95:14266, 1998; Norman et al., Science 285:591, 1999.

For example, referring to FIG. 14, cyclic proteins can be used as aptamer scaffolds
in a technique that allows members of a peptide library to be displayed as a constrained
loop between the N-terminus and C-terminus of the cyclic protein scaffold. As shown in
FIG. 14, an aptamer library can be expressed as an Ic-scaffold-In fusion protein (species
1). Progression of the intein-mediated cyclization reaction in vivo (step A) yields an In, an
Ic and a cyclic scaffold protein (species 2). The aptamer library is displayed in the linker
region between the N-terminus and C-terminus. In FIG. 14, N represents any amino acid,
the subscripts n and m are any integers equal to or greater than 0, and X
represents serine, threonine, or cysteine.

Other examples of using aptamers are also within the invention. Referring to FIG.
15, two such methods are described. In reaction I, an active intein intermediate is
mutagenized to replace the nucleophilic amino acid from In (step A) with a non-catalytic
amino acid (Z). This change inactivates the splicing reaction to yield species 2. Because
of the strong interaction between Ic and In, this technique allows the members of a peptide
library to be displayed as a constrained loop in the linker region between the two intein
portions (Target). In reaction II, an active intein intermediate is mutagenized to replace the
catalytic asparagine (step A) with a non-catalytic amino acid (Y) to yield species 2.
The intein-mediated cyclization reaction proceeds in vivo (step B) and arrests
at the lariat intermediate stage (species 3), allowing members of a peptide library to be
displayed as a constrained lactone.

Hybrid Systems

The yeast two-hybrid system is a well-studied method for analyzing in vivo protein-protein
interactions. Fields, S. and O. Song, Nature (London) 340:245, 1989. It and
variations thereof, such as one-hybrid systems, three-hybrid systems, reverse two-hybrid
systems, split-hybrid systems, alternative n-hybrid systems, and small molecule-based hybrid
systems, can be used to analyze the characteristics of cyclic peptides and/or splicing
intermediates by adapting known methods. See, e.g., Drees, B. L., Current Opin. Chem.
Biol., 3:64, 1999; Vidal, M., and P. Legrain, Nucleic Acids Research, 27:919, 1999;
Current Protocols in Molecular Biology, eds., Ausubel, M., et al., Wiley, New York,
1996; Huang, J. and S.L. Schreiber, Proc. Natl. Acad. Sci. (USA) 94:13396, 1997; Yang,
M., et al., Nucleic Acids Res. 23:1152, 1995; Colas, P., et al., Nature (London), 380:548,
1996; Xu, et al., Proc. Natl. Acad. Sci. (USA) 94:12473, 1997.

For example, referring to FIG. 16, a method of identifying a target protein that
interacts with a splicing intermediate is within the invention. In this method, an active intein
intermediate (species 1) is mutagenized to replace the catalytic asparagine (step A) with a
non-catalytic amino acid (Y) and to introduce a DNA-binding domain downstream of In
(step B) to yield species 2. The intein-mediated cyclization reaction will proceed until the
lariat intermediate (species 3) is formed (step C). In and Ic form a strong non-covalent
complex. The resulting lariat intermediate is then co-expressed with a target protein
attached to a DNA-binding domain (step D). Interaction of the lariat intermediate with the
target protein (species 4) causes activation of a promoter region (step E) leading to
expression of the reporter gene (*). This method allows identification of target molecules
able to bind the lariat intermediate. This method can be modified such that a known
molecule (in place of an unknown target protein) is attached to a DNA-binding domain, so
that lariat intermediates displaying a looped peptide that binds the known molecule can be
identified.

Referring to FIG. 17, another method for identifying a target protein that interacts
with a splicing intermediate is described. In this method, an active intein intermediate
(species 1) is mutagenized to replace the catalytic asparagine (step A) with a non-catalytic
amino acid (Y) and to introduce a DNA-binding domain upstream of Ic (step B) to yield
species 2. The intein-mediated cyclization reaction will proceed until the lariat intermediate
(species 3) is formed (step C). This molecule is then co-expressed with a target protein
attached to a DNA-binding domain (step D). Interaction of the lariat intermediate with the
target protein (species 4) causes activation of a promoter region (step E) leading to
expression of the reporter gene (*). This method allows identification of target molecules
able to bind the lariat intermediate. This method can be modified such that a known
molecule (in place of an unknown target protein) is attached to a DNA-binding domain, so
that lariat intermediates displaying a looped peptide that binds the known molecule can be
identified.

Referring now to FIG. 18, another method for identifying a target protein that
interacts with a splicing intermediate is described. In this method, an active intein
intermediate (species 1) is mutagenized to replace the In nucleophile (step A) with a non-catalytic
amino acid (Z) and to introduce a DNA-binding domain (DBD) downstream of In
(step B). These processes will inactivate the splicing reaction and will generate an
Ic-peptide-In-DBD fusion protein (species 2). This molecule is then co-expressed with a target
protein attached to a DNA-binding domain (step C). Interaction of the fusion protein with
the target protein (species 3) causes activation of a promoter region (step D) leading to
expression of the reporter gene (*). This method allows identification of target molecules
able to bind the fusion protein. This method can be modified such that a known molecule
(in place of an unknown target protein) is attached to a DNA-binding domain, so that fusion
proteins displaying a looped peptide that binds the known molecule can be identified.

Referring to FIG. 19, yet another method for identifying a target protein that
interacts with a splicing intermediate is described. In this method, an active intein
intermediate (species 1) is mutagenized to replace the In nucleophile (step A) with a non-catalytic
amino acid (Z) and to introduce a DNA-binding domain upstream of Ic (step B).
These processes will inactivate the splicing reaction and will generate a DBD-Ic-peptide-In
fusion protein (species 2). This molecule is then co-expressed with a target protein attached
to a DNA-binding domain (step C). Interaction of the fusion protein with the target protein
(species 3) in step D causes activation of a promoter region (step E) leading to expression of
the reporter gene (*). This method allows identification of target molecules able to bind the
fusion protein. This method can be modified such that a known molecule (in place of an
unknown target protein) is attached to a DNA-binding domain, so that fusion proteins
displaying a looped peptide that binds the known molecule can be identified.

Targeting Cyclic Peptides and Splicing Intermediates In Vivo

The cyclic peptides and splicing intermediates within the invention can be
specifically targeted to particular cellular locales or for extracellular secretion by using
modifications of targeting methods known in the art. See, e.g., Wilkinson et al., J.
Membrane Biol., 155:189, 1997; Komiya et al., The EMBO J., 17:3886, 1998; Kouranov
and Schnell, J. Biol. Chem. 271:31009, 1996; Bhagwat et al., J. Biol. Chem., 274:24014,
1999; Adam, S.A., Current Opin. Cell Biol. 11:402-406, 1999; Gorlich, D., Current Opin.
Cell Biol. 9:412, 1997; Pemberton et al., Current Opin. Cell Biol. 10:292, 1998;
Sakaguchi, M., Current Opin. Cell Biol. 8:595, 1997; Folsch et al., The EMBO J.
17:6508, 1998. For example, various signal peptides can be attached to the cyclic
peptides or splicing intermediates of the invention to cause them to localize to
predetermined cellular compartments or to be secreted into the extracellular space after
translation. In this manner, the cyclic peptides or splicing intermediates of the invention can
be targeted to cellular locales such as mitochondria, lysosomes, endoplasmic reticula,
chloroplasts, golgi, periplasm, the nucleus, and the plasma membrane. This method for targeting
can also be used in the methods for generating peptide libraries and methods of screening
such libraries where it is desired to identify molecules that interact or exhibit an activity at
predetermined cellular locations.

Examples

Preparation of Cyclic Dihydrofolate Reductase (DHFR) and Cyclic Pseudostellarin F
Materials and Methods 

Vector Construction 

The gene for the Ssp DnaE N-intein (In) was amplified from Ssp 6803 genomic
DNA with Taq polymerase and primers introducing 5'-BglII and NsiI and 3'-PstI restriction
sites. The Ssp DnaE Ic gene was amplified similarly with primers introducing [...]
3'-NdeI and SacI restriction sites. Plasmid pDIMCP resulted from individually cloning the
intein fragments into pDIMC7 [identical to pDIMC6 (see Ostermeier et al., Proc. Natl.
Acad. Sci. USA 96:3562, 1999) except for conversion of a BamHI restriction site into
BglII]. An alanine to histidine mutation in the Ic gene (A35H) was effected by Quick-Change
mutagenesis (Stratagene) resulting in pDIMCPAH. Excision of the intein fragments
as an NcoI/PstI digest and ligation into pAR4 [derived from pAR3 (Perez-Perez, J. and J.
Gutierrez, Gene, 158:141, 1995; American Type Culture Collection (ATCC) #87026) with
a unique NcoI in the multiple cloning site] generated pARCP (a in Fig. 3) and pARCPAH.
E. coli DHFR was amplified from pET22b-DHFR (Miller, G.P. and S. J. Benkovic,
Biochemistry 37:6327, 1998) with primers introducing a 5'-NdeI site followed by (CAC)6
(encoding six histidine residues) and a 3'-PstI site, digested with NdeI/PstI and ligated into
NdeI/NsiI digested pARCP or pARCPAH to produce pARCP-DHFR (b in Fig. 3) and
pARCPAH-DHFR (c in Fig. 3). A polyhistidine sequence was prepared synthetically with
NdeI, NsiI and BspMI sites, and ligated into pARCPAH to produce plasmid pARCP2-6H
which encodes cyclo-[CHMHHHHHHGAGAA]. Plasmid pARCP-p was produced in
three steps from pDIMCPAH: (1) Quick-Change mutagenesis introduced an AflII site into
In, generating pDIMCPMA; (2) the pseudostellarin F gene was synthetically prepared and
ligated into MfeI/AflII digested pDIMCPMA to produce pDIMCP-p; and (3) the fusion
construct was excised from pDIMCP-p as an NcoI/PstI fragment and ligated into NcoI/PstI
digested pAR4 to produce pARCP-p (e in Fig. 3). To produce plasmid pARCBD-p, a
KpnI site was introduced at the carboxyl terminus of the In gene of pARCP-p by Quick-Change
mutagenesis to produce pARCPpK. The gene encoding the chitin binding domain
was amplified from plasmid pCYB1 (New England Biolabs, Inc., Beverly, Massachusetts)
with primers introducing a 5' KpnI site and a 3' HindIII site. Both the PCR product and
pARCPpK were digested with KpnI and HindIII and ligated together to generate
pARCBD-p (f in Fig. 3). All enzymes were from Promega or New England Biolabs unless
otherwise noted.



DHFR purification

XL1-Blue cells harboring either pARCP-DHFR or pARCPAH-DHFR were
grown in LB medium plus 50 ug/ml chloramphenicol at 37°C until the culture reached an
OD600 of 0.7. The culture was induced with L-(+)-arabinose to a final concentration of
0.5% and grown at [...]°C for 24 hours. Cells were harvested by centrifugation (7,000 x g,
10 minutes) and frozen in liquid nitrogen. The cells were lysed, and DHFR-containing
proteins were purified as described (Miller and Benkovic, id.). The cyclic product was
separated from other DHFR-containing intermediates by FPLC using a Mono-Q column
(Amersham Pharmacia) eluted with a gradient of 0-1 M NaCl in 50 mM Tris-HCl over 30
minutes. Western blotting was performed with anti-His (Qiagen) and goat anti-mouse-alkaline
phosphatase-conjugated antibodies (Pierce) according to the manufacturers'
instructions.

Endoproteinase Lys-C Digestion
Wild-type or cyclic DHFR (50 ug) was treated with 0.5 ug of endoproteinase Lys-C
in 0.1 M NH4HCO3 at 37°C. Samples were taken at 6 and 24 hours, visualized on a
SDS/16% PAGE gel and submitted for matrix assisted, laser desorption ionization
(MALDI) time-of-flight mass spectrometry (Moore, W. T., Methods Enzymol., 289:520,
1997).

DHFR assays

Thermostability was assayed by preincubation of 100 nM wild type or cyclic
DHFR at either 25°C or 65°C in MTEN buffer [50 mM 2-(N-morpholino)ethanesulfonic
acid (MES), 25 mM tris(hydroxymethyl)aminomethane (Tris), 25 mM ethanolamine and
100 mM NaCl]. Aliquots were taken at various time points and equilibrated to room
temperature for five minutes in the presence of 100 uM 7,8-dihydrofolate. Activity assays
were initiated with reduced nicotinamide adenine dinucleotide phosphate (NADPH) as previously
described (Miller and Benkovic, supra).

Synthesis of cyclo-[Ser-Gly-Gly-Tyr-Leu-Pro-Pro-Leu]

To a solution of 3.5 mg (4 umol) of NH2-Ser-Gly-Gly-Tyr-Leu-Pro-Pro-Leu-CO2H
and 1.8 mg (16 umol) of N-hydroxysuccinimide in 20 ml of dimethylformamide was
added 3.0 mg (16 umol) of 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC). The
reaction was stirred for 10 hours at 25°C. An additional 3.0 mg of EDC was then added
and stirring was continued at 25°C for another 10 hours. The solvent was removed by
rotary evaporation and the residue was dissolved in 2 ml of water for purification by
reversed-phase HPLC on a Whatman Partisil 10 ODS-3 9.4-mm x 50-cm column eluted
with a linear gradient of 0-50% (vol/vol) acetonitrile in 0.1% trifluoroacetic acid/water over
30 minutes. The appropriate fractions were lyophilized to yield 2.8 mg (80%) of a white
solid [m/z 785 (MH+)]. 1H-NMR and UV-visible spectra of the synthetically prepared
material were consistent with published spectra for the isolated natural product (Morita, H.,
et al., Tetrahedron 50:9975, 1994).
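The reported ion mass and yield can be cross-checked with a short calculation. The Python
sketch below sums standard residue masses for the sequence SGGYLPPL, treating the
head-to-tail cyclic peptide as the bare sum of residues (no terminal water). The residue and
ion mass constants are standard reference values supplied for the example, and the function
names are hypothetical.

```python
# Monoisotopic residue masses (Da), used for the expected ion masses.
MONO = {"S": 87.03203, "G": 57.02146, "Y": 163.06333, "L": 113.08406, "P": 97.05276}
# Average residue masses (Da), used for weighed quantities.
AVG = {"S": 87.0782, "G": 57.0519, "Y": 163.1760, "L": 113.1594, "P": 97.1167}
WATER_AVG = 18.0153   # average mass of the water retained by the linear peptide
PROTON = 1.00728      # mass added in an MH+ ion
NA_ION = 22.98922     # mass added in an MNa+ ion
K_ION = 38.96316      # mass added in an MK+ ion


def residue_sum(seq, table):
    """Sum residue masses; for a head-to-tail cyclic peptide this is the full mass."""
    return sum(table[aa] for aa in seq)


def cyclic_adducts(seq):
    """Expected m/z of singly charged adducts of the cyclic peptide."""
    m = residue_sum(seq, MONO)
    return {"MH+": m + PROTON, "MNa+": m + NA_ION, "MK+": m + K_ION}


def cyclization_yield(seq, linear_mg, cyclic_mg):
    """Molar yield of cyclic product from the weighed linear precursor."""
    linear_mw = residue_sum(seq, AVG) + WATER_AVG
    cyclic_mw = residue_sum(seq, AVG)
    return (cyclic_mg / cyclic_mw) / (linear_mg / linear_mw)


adducts = cyclic_adducts("SGGYLPPL")
molar_yield = cyclization_yield("SGGYLPPL", linear_mg=3.5, cyclic_mg=2.8)
print({ion: round(mz, 2) for ion, mz in adducts.items()})
print(round(molar_yield, 2))
```

On these assumptions the calculated MH+ value is about 785.4, in line with the reported
m/z 785, and the molar yield comes out near the stated 80%.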

Pseudostellarin F Purification
E. coli strains XL1-Blue, DH5α or BL21-DE3 harboring pARCP-p were grown
and harvested as described for DHFR purification. The media (500 ml) was extracted
three times with 1-butanol (3 x 100 ml). The extracts were combined and evaporated, and
the solid residue was resuspended in 2 ml 0.1 M K2HPO4 (pH 8.0; lysis buffer). Cells were
resuspended in 10 ml of lysis buffer, sonicated, and clarified by centrifugation (20,000 x g,
20 minutes). The lysate was extracted (3 x 5 ml of n-butanol), and extracts were
combined, evaporated and resuspended in 500 ul of lysis buffer. The recombinant product
was purified from lysate and media extracts by HPLC as described above. Lyophilization
of the appropriate fractions from the lysate and media extractions yielded an oily residue,
m/z 785.47 (MH+), 807.43 (MNa+) and 823.44 (MK+). 1H-NMR and UV-visible
spectra of the recombinant material were consistent with published spectra for the isolated
natural product (Morita et al., supra). Proteins fused to the chitin binding domain were
prepared as described above through generation of the clarified lysate. The lysate was
passed over a chitin column (New England Biolabs, Inc.) equilibrated with lysis buffer. The
column was eluted isocratically, and fractions containing splicing intermediates were pooled
and submitted for MALDI mass spectral analysis.

Tyrosinase Cloning



The tyrosinase gene (including ORF 438) from Streptomyces antibioticus
(Bernan et al., Gene 37:101, 1985) was amplified with Vent polymerase from pIJ702
(ATCC no. 35287) with primers introducing 5' NdeI and 3' EcoRI restriction sites. The
PCR product was digested with NdeI and EcoRI and ligated into similarly digested
pDIMN2 (Ostermeier et al., supra) to generate pDIMN-Y. Transformed ligation mixtures
were grown at ambient temperature for 5 days, and colonies that expressed tyrosinase
were identified by pigment formation on FeCuY plates [LB agar plates containing ampicillin
(200 ug/ml), FeCl3·6H2O (0.2 mM), CuSO4·5H2O (0.2 mM), L-tyrosine (0.3 mg/ml, Y)
and isopropyl-β-D-galactoside (1 mM)] (Della-Cioppa, G. et al., Biotechnology 8:634,
1990).

Results 

Design of Genetic Constructs 
The genes encoding Ssp Ic and In were amplified from Ssp genomic DNA by
standard molecular biology methods (Sambrook, et al., supra) and serially ligated into
pDIMC7. The resulting cyclization precursor (CP) fragment was excised from pDIMC7
and cloned adjacent to the AraB promoter of pAR3 to generate the pARCP vector series
(Fig. 3). These vectors activate the expression of cyclization precursors in the presence of
arabinose. The E. coli DHFR gene was cloned between the NdeI and NsiI sites of
pARCP to create an in-frame fusion with each of the split intein genes (b and c in Fig. 3).
The PCR primer used to amplify DHFR also introduced a sequence encoding a six-histidine
tag at the 5' end of the DHFR gene to allow immunodetection of the region to be
cyclized. Two DHFR constructs were assembled in order to investigate the role of the
penultimate residue of Ic in acid/base catalysis of asparagine side chain cyclization. Plasmid
pARCP-DHFR (b in Fig. 3) encodes wild type Ic, which has an alanine residue
neighboring the terminal asparagine. Plasmid pARCPAH-DHFR (c in Fig. 3) incorporates
an alanine-to-histidine mutation at the penultimate position in the Ic gene. To produce
pseudostellarin F (cyclo-[SGGYLPPL]), the vector was modified by silent mutation to
create an AflII site at the 5'-end of the In gene (d in Fig. 3). An MfeI site occurs naturally
at the 3'-end of the wild-type Ic gene. Ligation of a synthetically prepared, double-stranded
insert encoding pseudostellarin F into the modified vector produced plasmid
pARCP-p (e in Fig. 3). A KpnI site was introduced at the 3'-end of the In gene in order to
fuse the gene for the chitin-binding domain to the pseudostellarin-producing construct (f in
Fig. 3).

Production and Characterization of Cyclic DHFR 
DHFR cyclization was readily apparent by SDS-PAGE upon arabinose induction
of pARCP-DHFR as shown in FIG. 4 (F: Ic-DHFR-In fusion protein. T: Ic-DHFR-In
fusion thioester intermediate. R: Ic-DHFR lariat intermediate. L: linear DHFR. O: cyclic
DHFR. In: N-intein. Ic: C-intein. Lane 1: uninduced XL1-Blue/pARCP-DHFR. Lane 2:
arabinose induced XL1-Blue/pARCP-DHFR. Lane 3: arabinose induced
XL1-Blue/pARCPAH-DHFR. Lane 4: lane 3 crude lysate after methotrexate agarose. Lane 5:
lane 4 material post FPLC. Lane 6: Wild-type DHFR).

Bands with apparent molecular weights corresponding to the linear (L, 23 kDa) and
cyclic (O, 21 kDa) DHFR products, the fusion protein (F, 37 kDa), In (14 kDa) and Ic (4
kDa) were clearly visible, as were bands tentatively assigned as the thioester (T, 36 kDa)
and lariat intermediates (R, 26 kDa) (Fig. 4, lane 2). Mutation of the penultimate residue of
Ic (A35) from alanine to histidine (c in Fig. 3) improved the yield of cyclic DHFR (Fig. 4,
lane 3). Methotrexate-agarose affinity chromatography of the crude lysate (Fig. 4, lane 4)
confirmed that the majority of the induced bands contained correctly folded DHFR.
Although In is not covalently attached to DHFR, it was retained on the methotrexate
column presumably due to non-covalent complex formation with the Ic-DHFR lariat
intermediate (R). The methotrexate-agarose eluant was fractionated by FPLC, allowing
purification of 5 mg of the cyclic product per liter of culture (Fig. 4, lane 5). Western
blotting (not shown) with an anti-His antibody demonstrated the presence of the
polyhistidine linker sequence (d in Fig. 3) in the FPLC-purified protein. The protein
migrated more rapidly in SDS/PAGE analyses than recombinant DHFR (Fig. 4, lane 6)
despite the extra 11-amino acid linker sequence (b in Fig. 3), implying an additional
topological constraint. Furthermore, no reaction was detected when the FPLC-purified
protein was reacted with phenylisothiocyanate (Edman, P., Acta Chem. Scand., 4:283,
1950), suggesting that the amino terminus was unavailable.

Cyclic DHFR had steady-state kinetic parameters and substrate, cofactor and
methotrexate dissociation constants which were indistinguishable from the wild type enzyme
at 25°C. Activity assays conducted after 65°C preincubation of wild type and cyclic
DHFR indicated that cyclization improved the thermostability of the enzyme (Fig. 5).
Endoproteinase Lys-C digestion was used to demonstrate unambiguously that the FPLC
purified protein was cyclic DHFR. Digestion of the wild type enzyme produces
amino-terminal (4.4 kDa) and carboxy-terminal (6.3 kDa) fragments; in a cyclic protein,
these two fragments would be joined, resulting in a 10.7 kDa digestion product (Fig. 6).
The FPLC purified material was resistant to proteolysis compared to the wild type enzyme,
and mass spectral analysis of the digestion mixtures identified a 10.7 kDa peak in the
product resulting from the cyclic protein, which was absent in the wild type enzyme (data
not shown).
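
The expected mass of the joined digestion product follows from adding the two wild type
fragment masses. The following is a minimal Python sketch of that arithmetic, given only as
an illustration: the function and constant names are hypothetical, and the calculation works
at the 0.1 kDa precision quoted in the text, so any smaller correction is ignored.

```python
# Fragment masses (kDa) quoted in the text for Lys-C digestion of wild type DHFR.
N_TERMINAL_KDA = 4.4
C_TERMINAL_KDA = 6.3


def joined_fragment_mass(n_term_kda, c_term_kda):
    """Mass (kDa, one decimal place) of the single product expected when the
    amino-terminal and carboxy-terminal fragments remain joined, as in a
    backbone-cyclized protein."""
    return round(n_term_kda + c_term_kda, 1)


if __name__ == "__main__":
    print(joined_fragment_mass(N_TERMINAL_KDA, C_TERMINAL_KDA))
```

A peak at this summed mass, together with the absence of the two separate fragments, is the
signature used in the text to distinguish the cyclic protein from the wild type enzyme.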

Production and Characterization of Pseudostellarin F
Pseudostellarin F production was readily detected in vivo through inhibition of
recombinant Streptomyces antibioticus tyrosinase (FIG. 7). In the experiment shown in
FIG. 7, XL1-Blue cells were co-transformed with pDIM-N[...] (a & b) or pARCP-p (c & d).
The cells were plated on FeCuY plates with chloramphenicol (50 µg/ml), either without
(a & c) or with (b & d) L-(+)-arabinose (0.5%). Co-expression of pseudostellarin F in
tyrosinase-expressing cells reduced pigment formation (d in Fig. 7). Expression of an
unrelated cyclic peptide from pARCP2-6H failed to inhibit tyrosinase (a and b in Fig. 7),
and inhibition absolutely required arabinose induction (compare c and d in Fig. 7).
SDS/PAGE analysis of arabinose-induced pARCP-p in several bacterial strains (BL21-DE3,
DH5α, XL1-Blue) allowed the visualization of bands corresponding to the fusion protein
(F), thioester intermediate (T) and In. An intense, low molecular weight band was also
visible, but the resolution was insufficient to separate the lariat intermediate (R) and
Ic (data not shown). Although pseudostellarin F was too small to be visualized by
SDS/PAGE, mass spectral analysis indicated its presence in both the crude cell lysate and
media. Approximately 30 µg of the recombinant cyclic peptide was isolated from the cell
lysate per gram of wet cell mass. Pseudostellarin F was also isolated from the media by
1-butanol extraction followed by HPLC with a yield that varied between 2 mg/liter
(XL1-Blue) and 20 mg/liter (BL21-DE3) depending on the expression strain. The NMR
spectrum of the recombinant material was consistent with that reported for the natural
product (Morita et al., supra), and the retention time of the bacterially expressed cyclic
peptide was identical to a synthetically prepared standard. The recombinant material
failed to react with ninhydrin (see, Gordon, A.J., and R.A. Ford, The Chemist's Companion:
A Handbook of Practical Data, Techniques, and References, Wiley Interscience, New York,
1972), indicating a backbone cyclic peptide (lactam) rather than a lactone product.
Neither HPLC nor mass spectral analysis provided any evidence for production of the linear
parent peptide.

A chitin-binding domain was fused to the carboxy-terminus of In to affinity-purify
intermediates of the intein-mediated ligation reaction and characterize them by MALDI
mass spectrometry (Table 1).

                                    Mass, Da
Reaction Component      Linear       Cyclic       Observed

F, T                    24,380.5     NA           24,380.4
In                      19,642.0     NA           19,642.3
R                        4,756.5     4,738.5       4,756.2
Ic                       3,969.2     3,951.2       3,953.0
Pseudostellarin F          802.4       784.4         784.4

Table 1: Mass spectral characterization of pseudostellarin F cyclization intermediates

All of the intermediates of the splicing reaction, including Ic, were retained when the crude
cell lysate from arabinose-induced pARCBD-p in XL1-Blue was passed over a chitin
affinity column. Pseudostellarin F was recovered from the unretained material by 1-butanol
extraction. The observed molecular masses for the fusion protein (F), the thioester
intermediate (T) and In were in excellent agreement with the values predicted from the gene
sequence. The mass of Ic was consistent with the asparagine-cyclized form as predicted
from the proposed mechanism of product release. The molecular mass of the lariat
intermediate (R) was more consistent with the linear Ic-pseudostellarin F fusion product
than the branched lactone product expected from the transesterification reaction.
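
The assignments drawn from Table 1 can be illustrated with a short calculation. The
following is a minimal Python sketch, assuming a simple nearest-mass rule that is our own
illustrative choice and not a procedure stated in this specification; the names TABLE_1,
closest_form and assign_forms are hypothetical. The masses are copied from Table 1, where
each predicted cyclic form is 18.0 Da lighter than the corresponding linear form.

```python
# Predicted and observed masses (Da) copied from Table 1.
# F/T and In are omitted because Table 1 lists no cyclic form for them (NA).
TABLE_1 = {
    "R": {"linear": 4756.5, "cyclic": 4738.5, "observed": 4756.2},
    "Ic": {"linear": 3969.2, "cyclic": 3951.2, "observed": 3953.0},
    "Pseudostellarin F": {"linear": 802.4, "cyclic": 784.4, "observed": 784.4},
}


def closest_form(row):
    """Return 'linear' or 'cyclic', whichever predicted mass lies nearer
    to the observed mass (illustrative nearest-mass rule)."""
    d_linear = abs(row["observed"] - row["linear"])
    d_cyclic = abs(row["observed"] - row["cyclic"])
    return "linear" if d_linear < d_cyclic else "cyclic"


def assign_forms(table):
    """Map each reaction component to its best-matching predicted form."""
    return {name: closest_form(row) for name, row in table.items()}


if __name__ == "__main__":
    for name, form in assign_forms(TABLE_1).items():
        print(f"{name}: {form}")
```

Under this rule the observed mass of R matches the linear prediction, while Ic and
pseudostellarin F match the cyclic predictions, which agrees with the interpretation given
in the text.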

Other Embodiments

While the above specification contains many specifics, these should not be construed
as limitations on the scope of the invention, but rather as examples of preferred embodiments
thereof. Many other variations are possible. Accordingly, the scope of the invention should
be determined not by the embodiments illustrated, but by the appended claims and their legal
equivalents.

What is claimed is:



1. A non-naturally occurring nucleic acid molecule encoding a polypeptide
comprising a first portion of a split intein, a second portion of a split intein, and a target
peptide interposed between the first portion of a split intein and the second portion of a
split intein;

wherein expression of the nucleic acid molecule in a host system produces the
polypeptide in a form selected from the group consisting of: (a) a polypeptide that
spontaneously splices in the host system to yield a cyclized form of the target peptide, and
(b) a splicing intermediate of a cyclized form of the target peptide.

2. The non-naturally occurring nucleic acid molecule of claim 1, wherein the
polypeptide is a polypeptide that spontaneously splices in the host system to yield a cyclized
form of the target peptide.

3. The non-naturally occurring nucleic acid molecule of claim 1, wherein the
polypeptide is a splicing intermediate of a cyclized form of the target peptide.

4. The non-naturally occurring nucleic acid molecule of claim 1, wherein
both the first portion of a split intein and the second portion of a split intein are derived
from a naturally-occurring split intein.

5. The non-naturally occurring nucleic acid molecule of claim 4, wherein
both the first portion of a split intein and the second portion of a split intein are derived from
Ssp DnaE.

6. The non-naturally occurring nucleic acid molecule of claim 1, wherein
at least one of the first portion of a split intein and the second portion of a split intein is
derived from a non-naturally occurring split intein.

7. The non-naturally occurring nucleic acid molecule of claim 6, wherein the
non-naturally occurring split intein is derived from the group consisting of RecA, DnaB, Psp
Pol-I, and Pfu inteins.

8. The non-naturally occurring nucleic acid molecule of claim 1, wherein
both the first portion of a split intein and the second portion of a split intein are derived from
a non-naturally occurring split intein.

9. The non-naturally occurring nucleic acid molecule of claim 3, wherein the
splicing intermediate is an active intein intermediate.

10. The non-naturally occurring nucleic acid molecule of claim 3, wherein the
splicing intermediate is a thioester intermediate.

11. The non-naturally occurring nucleic acid molecule of claim 3, wherein the
splicing intermediate is a lariat intermediate.

12. A non-naturally occurring nucleic acid molecule encoding a polypeptide
comprising a first portion of a split intein, a second portion of a split intein, a third portion
of a split intein, and a fourth portion of a split intein, wherein a first target peptide is
interposed between the first portion of a split intein and the second portion of a split intein,
and a second target peptide is interposed between the third portion of a split intein and the
fourth portion of a split intein.

13. The non-naturally occurring nucleic acid molecule of claim 12 wherein the
first portion of a split intein is complementary to the third portion of a split intein but not
complementary to the second portion of a split intein, and the second portion of a split intein
is complementary to the fourth portion of a split intein but not complementary to the third
portion of a split intein.

14. An expression vector comprising a nucleic acid molecule that encodes a
polypeptide comprising a first portion of a split intein, a second portion of a split intein, and
a target peptide interposed between the first portion of a split intein and the second portion
of a split intein, wherein expression of the nucleic acid molecule in a host system produces
the polypeptide in a form selected from the group consisting of: (a) a polypeptide that
spontaneously splices in the host system to yield a cyclized form of the target peptide, and
(b) a splicing intermediate of a cyclized form of the target peptide.

15. The expression vector of claim 14, wherein the polypeptide is a
polypeptide that spontaneously splices in the host system to yield a cyclized form of the
target peptide.



16. The expression vector of claim 14, wherein the polypeptide is a splicing
intermediate of a cyclized form of the target peptide.

17. The expression vector of claim 15, wherein the nucleic acid molecule
further comprises a regulatory sequence that facilitates expression of [...]
host system.

18. The expression vector of claim 14, wherein the nucleic acid molecule
further comprises a nucleotide sequence that encodes a peptide that facilitates screening of
the cyclized form of the target peptide for a particular characteristic.

19. The expression vector of claim 14, wherein the nucleic acid molecule
further comprises a nucleotide sequence that encodes a peptide that facilitates purifying the
cyclized form of the target peptide from the host system.

20. The expression vector of claim 14, wherein the target peptide has a first
end fused to the first portion of a split intein and a second end fused to the second portion
of a split intein.

21. The expression vector of claim 14, wherein both the first portion of a split
intein and the second portion of a split intein are derived from a naturally-occurring split
intein.

22. The expression vector of claim 21, wherein both the first portion of a split
intein and the second portion of a split intein are derived from Ssp DnaE.

23. The expression vector of claim 14, wherein at least one of the first portion
of a split intein and the second portion of a split intein is derived from a non-naturally
occurring split intein.

24. The expression vector of claim 23, wherein the non-naturally occurring split
intein is derived from the group consisting of RecA, DnaB, Psp Pol-I, and Pfu inteins.

25. The expression vector of claim 14, wherein both the first portion of a split
intein and the second portion of a split intein are derived from a non-naturally occurring split
intein.

26. The expression vector of claim 16, wherein the splicing intermediate is an
active intein intermediate.

27. The expression vector of claim 16, wherein the splicing intermediate is a
thioester intermediate.

28. The expression vector of claim 16, wherein the splicing intermediate is a
lariat intermediate.

29. The expression vector of claim 14, wherein the host system comprises a
prokaryotic cell.

30. The expression vector of claim 29, wherein the prokaryotic cell is a
bacterium.

31. The expression vector of claim 30, wherein the bacterium is Escherichia
coli.

32. The expression vector of claim 14, wherein the host system comprises a
eukaryotic cell.

33. The expression vector of claim 32, wherein the eukaryotic cell is a yeast.

34. The expression vector of claim 33, wherein the eukaryotic cell is a
mammalian cell.

35. The expression vector of claim 14, wherein the host system comprises an
archaebacterium.

36. The expression vector of claim 14, wherein the host system comprises a
plant cell.



37. The expression vector of claim 14, wherein the vector is a plasmid.

38. The expression vector of claim 14, wherein the vector is a bacteriophage.

39. The expression vector of claim 14, wherein the vector is a virus.

40. The expression vector of claim 14, wherein the vector is a linear nucleic
acid molecule.

41. A substantially pure polypeptide comprising a first portion of a split intein,
a second portion of a split intein, and a target peptide interposed between the first portion
of a split intein and the second portion of a split intein, wherein the polypeptide is selected
from the group consisting of: (a) a polypeptide that spontaneously splices in the host system
to yield a cyclized form of the target peptide, and (b) a splicing intermediate of a cyclized
form of the target peptide.

42. The polypeptide of claim 41, wherein the polypeptide is a polypeptide that
spontaneously splices in the host system to yield a cyclized form of the target peptide.

43. The polypeptide of claim 41, wherein the polypeptide is a splicing
intermediate of a cyclized form of the target peptide.

44. The polypeptide of claim 41, wherein the target peptide has a first end
fused to the first portion of a split intein and a second end fused to the second portion of a
split intein.

45. The polypeptide of claim 41, wherein both the first portion of a split intein
and the second portion of a split intein are derived from a naturally-occurring split intein.

46. The polypeptide of claim 45, wherein both the first portion of a split intein
and the second portion of a split intein are derived from Ssp DnaE.

47. The polypeptide of claim 41, wherein at least one of the first portion of a
split intein and the second portion of a split intein is derived from a non-naturally occurring
split intein.

48. The polypeptide of claim 47, wherein the non-naturally occurring split intein
is derived from the group consisting of RecA, DnaB, Psp Pol-I, and Pfu inteins.

49. The polypeptide of claim 41, wherein both the first portion of a split intein
and the second portion of a split intein are derived from a non-naturally occurring split
intein.

50. The polypeptide of claim 43, wherein the splicing intermediate is an active
intein intermediate.

SUBSTITUTE SHEET (RULE 26)

WO 00/36093    PCT/US99/30162

51. The polypeptide of claim 43, wherein the splicing intermediate is a thioester
intermediate.

52. The polypeptide of claim 43, wherein the splicing intermediate is a lariat
intermediate.

53. A host system comprising a non-naturally occurring nucleic acid molecule
encoding a polypeptide comprising a first portion of a split intein, a second portion of a
split intein, and a target peptide interposed between the first portion of a split intein and the
second portion of a split intein;

wherein expression of the nucleic acid molecule in the host system produces the
polypeptide in a form selected from the group consisting of: (a) a polypeptide that
spontaneously splices in the host system to yield a cyclized form of the target peptide, and
(b) a splicing intermediate of a cyclized form of the target peptide.

54. The host system of claim 53, wherein the polypeptide is a polypeptide that
spontaneously splices in the host system to yield a cyclized form of the target peptide.

55. The host system of claim 53, wherein the polypeptide is a splicing
intermediate of a cyclized form of the target peptide.

56. The host system of claim 53, wherein the host system comprises a
prokaryote.
57. The host system of claim 56, wherein the prokaryote is a bacterium.

58. The host system of claim 53, wherein the host system comprises an
archaebacterium.

59. The host system of claim 53, wherein the host system comprises a
eukaryote.

60. The host system of claim 59, wherein the eukaryote is a yeast.

61. The host system of claim 59, wherein the eukaryote is a mammalian cell.

62. The host system of claim 53, wherein the host system comprises a plant
cell.

63. A method for making a peptide molecule, the method comprising the steps
of:

providing an isolated nucleic acid molecule that encodes a polypeptide comprising
a first portion of a split intein, a second portion of a split intein, and a target peptide
interposed between the first portion of a split intein and the second portion of a split intein,
wherein expression of the nucleic acid molecule in a host system produces the peptide
molecule in a form selected from the group consisting of: (a) a cyclized form of the target
peptide resulting from spontaneously splicing of the polypeptide in the host system, and (b)
a splicing intermediate of a cyclized form of the target peptide;

providing the host system;

introducing the isolated nucleic acid molecule into the host system; and

expressing the isolated nucleic acid molecule.

64. The method of claim 63, wherein the step of expressing the isolated nucleic
acid molecule results in production of a polypeptide that spontaneously splices in the host
system to yield the cyclized form of the target peptide.

65. The method of claim 64 further comprising the step of purifying the cyclized
form of the target peptide from the host system.

66. The method of claim 63, wherein the step of expressing the isolated nucleic
acid molecule results in production of a splicing intermediate of a cyclized form of the target
peptide.

67. The method of claim 66 further comprising the step of purifying the splicing
intermediate of a cyclized form of the target peptide from the host system.

68. The method of claim 66, wherein the splicing intermediate is an active
intein intermediate.

69. The method of claim 66, wherein the splicing intermediate is a thioester
intermediate.

70. The method of claim 66, wherein the splicing intermediate is a lariat
intermediate.

71. The method of claim 66, further comprising the step of forming the cyclic
peptide from the splicing intermediate.

72. The method of claim 63, wherein the isolated nucleic acid molecule is
incorporated into an expression vector that facilitates expression of the isolated nucleic
acid molecule in the host system.

73. The method of claim 72, wherein the expression vector is a plasmid.

74. The method of claim 72, wherein the expression vector is a bacteriophage.

75. The method of claim 72, wherein the expression vector is a virus.

76. The method of claim 63, wherein the host system comprises a prokaryotic
cell.

77. The method of claim 76, wherein the prokaryotic cell is a bacterium.

78. The method of claim 77, wherein the bacterium is Escherichia coli.

79. The host system of claim 63, wherein the host system comprises an
archaebacterium.

80. The method of claim 63, wherein the host system comprises a eukaryotic
cell.

81. The method of claim 80, wherein the eukaryotic cell is a yeast.

82. The method of claim 80, wherein the eukaryotic cell is a mammalian cell.

83. The method of claim 63, wherein the host system comprises a plant cell.

84. The method of claim 63, wherein the host system comprises an in vitro
transcription/translation system.

85. The method of claim 84, wherein the in vitro transcription/translation system
comprises a cell lysate.

86. The method of claim 64, wherein the production of the target peptide in
cyclized form occurs in the host system in the absence of an exogenously-added agent.

87. The method of claim 86, wherein the exogenously-added agent is a
protease.

88. The method of claim 86, wherein the exogenously-added agent is a thiol.

89. The method of claim 72, wherein the expression vector is inducible.

90. A method of preparing a library of peptide molecules, the method
comprising the steps of:

providing a plurality of nucleic acid molecules encoding a plurality of target peptides
having heterogenous amino acid sequences;

incorporating each of the plurality of nucleic acid molecules into an expression
vector to form a plurality of expression vectors, whereby each of the plurality of nucleic
acid molecules is interposed between a nucleic acid molecule encoding a first portion of a
split intein and a nucleic acid molecule encoding a second portion of a split intein in each of
the formed expression vectors, wherein expression of the expression vectors in a host
system results in the production of a plurality of peptide molecules in a form selected from
the group consisting of: (a) polypeptides that spontaneously splice in the host system to
yield cyclized forms of the target peptides, and (b) splicing intermediates of cyclized forms
of the target peptides; and

expressing the expression vectors in the host system.

91. The method of claim 90, wherein the plurality of polypeptides are
polypeptides that spontaneously splice in the host system to yield cyclized forms of the
target peptides.

92. The method of claim 90, wherein the plurality of polypeptides are splicing
intermediates of cyclized forms of the target peptides.

93. The method of claim 90, wherein the plurality of nucleic acid molecules
encoding a plurality of target peptides are produced by solid phase synthesis.

94. The method of claim 90, wherein the plurality of nucleic acid molecules
encoding a plurality of target peptides are produced using polymerase chain reaction.

95. The method of claim 90, wherein the plurality of nucleic acid molecules
encoding a plurality of target peptides are produced by enzymatically digesting a larger
nucleic acid molecule.

96. The method of claim 95, wherein the larger nucleic acid molecule is
derived from an organism.

97. The method of claim 90, wherein the plurality of nucleic acid molecules
encoding a plurality of target peptides are produced from a progenitor nucleic acid
molecule that has been amplified under conditions which introduce mutations into the
progenitor nucleic acid molecule's nucleotide sequence.

98. A method of screening a peptide molecule for a predetermined
characteristic, the method comprising the steps of:

providing a nucleic acid molecule that encodes a polypeptide comprising a first
portion of a split intein, a second portion of a split intein, and a target peptide interposed
between the first portion of a split intein and the second portion of a split intein, wherein
expression of the nucleic acid molecule in a host system produces the peptide molecule in a
form selected from the group consisting of: (a) a cyclized form of the target peptide resulting
from spontaneously splicing of the polypeptide in the host system, and (b) a splicing
intermediate of a cyclized form of the target peptide;

providing the host system;

introducing the isolated nucleic acid molecule in the host system;

placing the host system under conditions that cause the peptide molecule to be
produced; and

testing the peptide molecule for the predetermined characteristic.

99. The method of claim 98, wherein the peptide molecule is a cyclized form of
the target peptide.

100. The method of claim 98, wherein the peptide molecule is a splicing
intermediate of a cyclized form of the target peptide.

101. The method of claim 98, wherein the predetermined characteristic
comprises the ability to specifically bind a target molecule, and the step of testing the
peptide molecule for the predetermined characteristic comprises the steps of (a) contacting
the peptide molecule to the target molecule and (b) determining whether the peptide
molecule binds to the target molecule.

102. The method of claim 101, wherein the step of determining whether the
peptide molecule binds to the target molecule is measured by observing a color change.

103. The method of claim 101, wherein the step of determining whether the
peptide molecule binds to the target molecule is measured by observing a fluorescent
signal.

104. The method of claim 101, wherein the step of determining whether the
peptide molecule binds to the target molecule is measured by analyzing the cell cycle of an
organism.

105. The method of claim 101, wherein the step of determining whether the
peptide molecule binds to the target molecule is measured by analyzing the reproduction
of an organism.

106. The method of claim 101, wherein the target molecule is a cell-associated
molecule.

107. The method of claim 106, wherein the cell-associated molecule is a
membrane-associated molecule.

108. The method of claim 106, wherein the cell-associated molecule is an
intracellular molecule.

109. The method of claim 108, wherein the intracellular molecule is a nuclear
molecule.

110. The method of claim 108, wherein the intracellular molecule is an organelle.

111. The method of claim 110, wherein the organelle is selected from the group
consisting of: mitochondria, lysosomes, endoplasmic reticula, chloroplasts, golgi, and
periplasm.

112. The method of claim 101, wherein the target molecule is an extracellular
molecule.

113. The method of claim 98, wherein the predetermined characteristic is the
ability to modulate a biochemical reaction, and the step of testing the peptide molecule for
the predetermined characteristic comprises the steps of (a) contacting the peptide molecule
to a system containing the biochemical reaction and (b) determining whether the peptide
molecule modulates the biochemical reaction.

114. The method of claim 113, wherein the step of determining whether the
peptide molecule modulates the biochemical reaction is measured by observing a color
change.

115. The method of claim 113, wherein the step of determining whether the
peptide molecule modulates the biochemical reaction is measured by observing a
fluorescent signal.

116. The method of claim 113, wherein the step of determining whether the
peptide molecule modulates the biochemical reaction is measured by analyzing the cell
cycle of an organism.

117. The method of claim 113, wherein the step of determining whether the
peptide molecule modulates the biochemical reaction is measured by analyzing the
reproduction of an organism.

118. The method of claim 113, wherein the biochemical reaction is a
cell-associated process.

119. The method of claim 118, wherein the biochemical reaction
is an intracellular metabolic event.

120. The method of claim 118, wherein the biochemical reaction is a
membrane-associated event.

121. The method of claim 118, wherein the biochemical reaction is a nuclear
event.

122. The method of claim 113, wherein the biochemical reaction is an
extracellular reaction.

123. The method of claim 98, wherein the step of testing the peptide molecule
for the predetermined characteristic is performed using a hybrid system.

124. The method of claim 98, further comprising the step of immobilizing the
peptide molecule on a solid phase support.

125. A method for purifying a cyclic peptide from a mixture, the method
comprising the steps of:

providing a mixture containing a splicing intermediate conjugated with an affinity tag;

mixing the conjugated splicing intermediate with a solid phase support having a
ligand thereon that specifically binds the affinity tag whereby the support becomes
specifically bound with the splicing intermediate;

washing the support to remove non-specifically bound matter from the support;

adding to the support a reagent that makes a cyclic peptide from the splicing
intermediate; and

eluting the cyclic peptide from the support.

126. A method for purifying a cyclic peptide from a mixture, the method
comprising the steps of:

providing a mixture containing a splicing intermediate conjugated with an affinity tag;

mixing the conjugated splicing intermediate with a solid phase support having a
ligand thereon that specifically binds the affinity tag whereby the support becomes
specifically bound with the splicing intermediate;

washing the support to remove non-specifically bound matter from the support;

eluting the splicing intermediate from the support; and

adding a reagent to the eluted splicing intermediate that makes a cyclic peptide from
the splicing intermediate.

127. A method for purifying a target molecule that binds a splicing intermediate
from a mixture, the method comprising the steps of:

providing a solid phase support having the splicing intermediate specifically bound
thereon;

contacting the support with the target molecule in the mixture;

washing the support to remove non-specifically bound matter from the support;
and

eluting the target molecule from the support.